Calling Bullshit: Data Reasoning in a Digital World
The world is awash in bullshit. Politicians are unconstrained by facts. Science is conducted by press release. Higher education rewards bullshit over analytic thought. Startup culture elevates bullshi...
Continuing my tour of books I should have already read, Calling Bullshit by @carlbergstrom.com and @jevinwest.bsky.social. Just a delight - an accessible, entertaining, insightful look at various forms of BS. Much like Weapons of Math Destruction, would love a 2025 update of this one.
07.10.2025 00:55
How I, a non-developer, read the tutorial you, a developer, wrote for me, a beginner - annie's blog
"Hello! I am a developer. Here is my relevant experience: I code in Hoobijag and sometimes jabbernocks and of course ABCDE++++ (but never ABCDE+/^+ are you kidding? ha!) and I like working with ...
"How I, a non-developer, read the tutorial you, a developer, wrote for me, a beginner" by Annie Mueller
anniemueller.com/posts/how-i-...
23.09.2025 07:57
BIG NEWS! We've updated the website of the Open Visualization Academy, where you can see all its contributors: openvisualizationacademy.org
This is the announcement in our newsletter: openvisualizationacademy.beehiiv.com/p/we-re-back...
#dataViz #infographics #dataJournalism #dataVis
1/x
25.09.2025 18:05
Screenshot of first page of slidecrafting-book.com website
I'm excited to announce a new resource about making slides with quarto and revealjs. This book is the combination of all the work I have done in this area, reordered and polished up.
There isn't a lot of new information yet, but this format allows me to add more easily.
slidecrafting-book.com
#quarto
24.09.2025 16:12
One mother for two species via obligate cross-species cloning in ants - Nature
In a case of obligate cross-species cloning, female ants of Messor ibericus need to clone males of Messor structor to obtain sperm for producing the worker caste, resulting in males from the same mother having distinct genomes and morphologies.
It's never occurred to me that it IS an assumption. This is the most astonishing start to a paper I've read in years:
"Living organisms are assumed to produce same-species offspring. Here, we report a shift from this norm in Messor ibericus, an ant that lays individuals from two distinct species."
24.09.2025 17:26
For instance, yesterday I read a paper with a table describing participants' sickness absence days with a mean of 71 and SD = 88. Generating a random (gaussian) sample using these values produces ~20% participants with less than zero sick days.
22.09.2025 13:24
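The ~20% figure can be checked directly. A minimal sketch, assuming a normal distribution with the reported mean of 71 and SD of 88; the sample size and seed are arbitrary choices, not values from the paper:

```python
import random
from statistics import NormalDist

MEAN, SD = 71, 88  # summary statistics quoted in the post

# Exact share of a Gaussian with these parameters that falls below zero
p_negative = NormalDist(mu=MEAN, sigma=SD).cdf(0)

# Simulated check (sample size and seed are arbitrary assumptions)
rng = random.Random(1)
n = 100_000
sim_share = sum(rng.gauss(MEAN, SD) < 0 for _ in range(n)) / n

print(f"analytic: {p_negative:.3f}, simulated: {sim_share:.3f}")
```

Both values come out at roughly a fifth, which suggests the sick-day variable is strongly right-skewed rather than normal.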
Review: Dashboards That Deliver, Nightingale
Dashboards That Deliver: How to Design, Develop, and Deploy Dashboards That Work, the upcoming book by Andy Cotgreave, Amanda Makulec, Jeffrey Shaffer, and...
The new book Dashboards That Deliver is a tour de force of data visualization and project management expertise.
Emilia Ruzicka reviews how the authors lay bare their expertise and process for everyone to benefit from their decades of experience.
nightingaledvs.com/review-dashb...
22.09.2025 14:12
Barcode plots showing the distribution of age of grandmaster, international master, fide master, and candidate master for male and female chess players. Age seems to be less related to title for male players.
For this week's #TidyTuesday chess player rating data, I made an annotated barcode plot to show the distribution of age by title. It was hard to set a good transparency level for the lines since there's such a difference between the number of male and female players.
#RStats #DataViz #ggplot2
22.09.2025 10:39
Chartle - A daily chart game
Guess the country in red by analysing today's chart
Launch day
We've just released @chartlecc.bsky.social - a daily chart game!
Your job is to guess which country is represented by the red line in today's chart. You get 5 tries, no other clues!
Play today, come back tomorrow for a different chart with new data and share with your chart friends
12.09.2025 13:41
DataBS Conf
"Data, Behind the Scenes" is a free-to-attend online-only, single track conference centered on the real stories of data work from the folks in the trenches. We're not here for the latest AI hype, perf...
#DataBS Conf 2025 preshow! We have two talks that we couldn't fit into the schedule, but the speakers pre-recorded them for us to share before the main event next week!
Both are really good and give me lots of excitement about what we'll see next week.
ti.to/databsconf/d... <- free tix
🧵1/3
18.09.2025 15:43
Shannon's slides are always so unbelievably clear and helpful!!!
github.com/shannonpileg...
I'm having "Ohhhhh that's what that means" moments every 10 seconds here.
#positconf2025
18.09.2025 15:09
This is pretty cool: UDFs went into #PowerBI yesterday; and today I'm using them in a non-trivial manner in an actual report.
@jaypowerbi.bsky.social Good job if this is you.
17.09.2025 23:59
Grid of ternary plots showing the percentage of fat, carbohydrates, and protein for different recipes on allrecipes.com, split by Italian, Cuban, French, Greek, Lebanese, and Japanese cuisines. Italian shows no points with high level of fat.
I normally only see these ternary plots used to show UK election results, but decided to see if I could make them work for this week's #TidyTuesday data from Allrecipes. A little bit tricky to add annotations to but overall, I quite like the result!
#RStats #ggplot2 #DataViz
15.09.2025 12:49
A scatterplot on a cream white background with the title "The chicken 🐔 or the egg 🥚?". The square grid is split by four reddish arrows pointing outwards north/south/east/west. A textbox in the top left reads "Based on 2,218 recipes categorized by the cuisine (country, region, culture), this graph shows the proportional frequency of the ingredients chicken vs. egg (x-axis) & butter vs. oil (y-axis) mentioned across recipes of each cuisine". Each point and text in the plot indicates where on a scale from chicken (left) to egg (right) and butter (top) to oil (bottom) the cuisine is located. Turkish is found at the very center of the plot, Chinese and Thai in the bottom left (chicken & oil), whereas Austrian/German/Swiss and particularly Scandinavian are in the top right (egg & butter). An arrow points at the top right data point saying "Scandinavian cuisine is a clear outlier". Visualization: C. Börstell; Data: allrecipes.com via {tastyR} & TidyTuesday; Packages: {ggarrow, ggrepel, ggtext, tidyverse}
The 🐔 or the 🍳? #TidyTuesday
Looking at the ingredients of over 2000 recipes online, where are different cuisines found on the chicken vs. egg (x-axis) and butter vs. oil (y-axis) scales?
As a Scandinavian, I guess I'm part of the egg+butter outlier!
Code: github.com/borstell/tid...
#R4DS
15.09.2025 12:52
This week's #TidyTuesday explores a curated collection of recipes from the Allrecipes website. I created donut chart subplots to look like plates and added a fork and knife image to each.
#pydytuesday #dataviz
16.09.2025 16:42
The only fun name I've found is GitLab data team's google sheets loader called "sheetload"
17.09.2025 15:42
New #rstats blog post: Deep Dive into ellmer: Part 2.
I explore the source code behind ellmer's tool calling functionality:
www.howardbaik.com/posts/deep-d...
Thanks to @hadley.nz, @grrrck.xyz, @atheriel.bsky.social, others for ellmer!
16.09.2025 15:22
SquidSim is the coolest package - it lets you build complex hierarchical data structures and then simulate data from the world you create. The best tool for doing proper power analyses and testing how well your models can uncover the 'truth'. I've been recommending it to everyone!
15.09.2025 17:08
Had the pleasure of teaching a workshop on #Shiny today.
The {shiny} package can be used to make a web-app that uses R (or Python) under the hood. Ideal for interactive #dataViz
15.09.2025 17:37
"I'm just a goat, standing in front of a contestant, asking them to choose me"
Monty Hall reminder: the only good reason to want the car is to sell it for more goats.
15.09.2025 07:34
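For anyone who wants the non-goat version of the reminder, here is a small simulation of the standard Monty Hall setup. It is a sketch under the usual assumptions (three doors, the host always opens a goat door other than the contestant's pick); the trial count and seed are arbitrary:

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


rng = random.Random(42)
n = 100_000
stay_rate = sum(play(False, rng) for _ in range(n)) / n
switch_rate = sum(play(True, rng) for _ in range(n)) / n
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

Staying wins about one time in three and switching about two times in three.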
If all the world were a monorepo
The R ecosystem and the case for extreme empathy in software maintenance
Really insightful post from Julie Tibshirani (spotted on LinkedIn, can't find on Bsky) reflecting on the unique governance structure of #rstats and what other languages can learn from it
jtibs.substack.com/p/if-all-the...
14.09.2025 23:29
output from a GAM in the linked essay
Simon Wood, the GOAT of generalized additive models & creator of the mgcv #rstats package, has an Annual Review of Statistics essay on GAMs, available open access #statssky #mlsky
www.annualreviews.org/content/jour...
10.09.2025 02:14
The move patterns are VERY frustrating too
09.09.2025 05:45
A plot with a top panel with a histogram showing all penguin weights, with a bottom faceted panel with species-specific weight histograms
library(tidyverse)
library(ggtext)
library(patchwork)
library(scales)

# Assumes a `penguins` data frame with a `body_mass` column,
# e.g. the one shipped with base R >= 4.5

# Top panel: all penguins, with the plot title styled to look like a facet strip
top_plot <- ggplot(penguins, aes(x = body_mass)) +
  geom_histogram(binwidth = 100, color = "white", boundary = 0) +
  scale_x_continuous(
    breaks = seq(2500, 6500, by = 1000),
    limits = c(2500, 6500),
    labels = label_comma()
  ) +
  labs(title = "All penguins", x = NULL, y = "Count") +
  theme_bw() +
  theme(
    plot.title = element_textbox_simple(
      face = "bold",
      fill = "grey75",
      size = rel(0.85),
      halign = 0,
      linetype = 1,
      linewidth = 0.2,
      padding = margin(5, 5, 5, 5)
    ),
    strip.background = element_rect(fill = "grey92"),
    strip.text = element_text(hjust = 0),
    axis.title.y = element_text(hjust = 1)
  )

# Bottom panel: one facet per species under a shared strip-like title
bottom_plot <- penguins |>
  ggplot(aes(x = body_mass, fill = species)) +
  geom_histogram(binwidth = 100, color = "white", boundary = 0) +
  scale_x_continuous(
    breaks = seq(2500, 6500, by = 1000),
    limits = c(2500, 6500),
    labels = label_comma()
  ) +
  guides(fill = "none") +
  facet_wrap(vars(species), ncol = 1) +
  labs(x = "Body mass (g)", y = "Count", title = "Specific penguin species") +
  theme_bw() +
  theme(
    plot.title = element_textbox_simple(
      face = "bold",
      fill = "grey75",
      size = rel(0.85),
      halign = 0,
      linetype = 1,
      linewidth = 0.2,
      padding = margin(5, 5, 5, 5)
    ),
    strip.background = element_rect(fill = "grey92"),
    strip.text = element_text(hjust = 0),
    axis.title.x = element_text(hjust = 0),
    axis.title.y = element_text(hjust = 1)
  )

# Stack the two panels with patchwork; the top one gets a quarter of the height
(top_plot / bottom_plot) +
  plot_layout(heights = c(0.25, 0.75))
The {ggh4x} package has neat support for nested facets in ggplot, but it wasn't quite working for a thing I was making, so I made a neat plot with fake nested facets using a combination of {ggtext} and {patchwork}! #rstats
(code here: gist.github.com/andrewheiss/...)
29.08.2025 14:02
An abstract pattern of interlocking hexagons, with a central group of colorful and detailed hexagons featuring various logos, characters, and text.
New from Posit! The August Glimpse newsletter is here, featuring the new free IDE Positron, complete with LLM-powered tools Positron Assistant and Databot. Plus, updates to Quarto and Shiny for Python!
Check out the post here: posit.co/blog/posit-g...
#RStats #Python
28.08.2025 15:18
Conversation: LLMs and Building Abstractions
How should we work with LLMs when growing abstractions?
NEW POST
Unmesh Joshi and I had an interesting email conversation about how when programming with an LLM he likes to grow a language of abstractions.
martinfowler.com/articles/con...
26.08.2025 13:34
#statstab #405 Best Practices for Estimating, Interpreting, and Presenting Nonlinear Interaction Effects
Thoughts: Guidance on nonlinear interactions, reporting (probabilities) and visualisations.
#probit #logit #logisticregression #nonlinear #guide
sociologicalscience.com/download/vol...
22.08.2025 19:20
Models as Prediction Machines: How to Convert Confusing Coefficients into Clear Quantities
Abstract
Psychological researchers usually make sense of regression models by interpreting coefficient estimates directly. This works well enough for simple linear models, but is more challenging for more complex models with, for example, categorical variables, interactions, non-linearities, and hierarchical structures. Here, we introduce an alternative approach to making sense of statistical models. The central idea is to abstract away from the mechanics of estimation, and to treat models as "counterfactual prediction machines," which are subsequently queried to estimate quantities and conduct tests that matter substantively. This workflow is model-agnostic; it can be applied in a consistent fashion to draw causal or descriptive inference from a wide range of models. We illustrate how to implement this workflow with the marginaleffects package, which supports over 100 different classes of models in R and Python, and present two worked examples. These examples show how the workflow can be applied across designs (e.g., observational study, randomized experiment) to answer different research questions (e.g., associations, causal effects, effect heterogeneity) while facing various challenges (e.g., controlling for confounders in a flexible manner, modelling ordinal outcomes, and interpreting non-linear models).
Figure illustrating model predictions. On the X-axis the predictor, annual gross income in Euro. On the Y-axis the outcome, predicted life satisfaction. A solid line marks the curve of predictions on which individual data points are marked as model-implied outcomes at incomes of interest. Comparing two such predictions gives us a comparison. We can also fit a tangent to the line of predictions, which illustrates the slope at any given point of the curve.
A figure illustrating various ways to include age as a predictor in a model. On the x-axis age (predictor), on the y-axis the outcome (model-implied importance of friends, including confidence intervals).
Illustrated are
1. age as a categorical predictor, resulting in predictions that bounce around a lot with wide confidence intervals
2. age as a linear predictor, which forces a straight line through the data points that has a very tight confidence band and
3. age splines, which lie somewhere in between, as they smoothly follow the data but have more uncertainty than the straight line.
Ever stared at a table of regression coefficients & wondered what you're doing with your life?
Very excited to share this gentle introduction to another way of making sense of statistical models (w @vincentab.bsky.social)
Preprint: doi.org/10.31234/osf...
Website: j-rohrer.github.io/marginal-psy...
25.08.2025 11:49
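The three quantities in the first figure (a prediction, a comparison, a slope) can be sketched without any modelling library. This is only an illustration: the `predict` function and its coefficients are hypothetical stand-ins for a fitted model, and none of this is the marginaleffects API.

```python
import math


def predict(income: float) -> float:
    """Hypothetical fitted curve: life satisfaction as a function of income (EUR).

    The intercept and coefficient are made up for illustration only.
    """
    return 2.0 + 0.5 * math.log(income)


def comparison(lo: float, hi: float) -> float:
    """Difference between the model-implied outcomes at two incomes."""
    return predict(hi) - predict(lo)


def slope(income: float, eps: float = 1e-3) -> float:
    """Tangent slope at one income, via a central finite difference."""
    return (predict(income + eps) - predict(income - eps)) / (2 * eps)


pred_20k = predict(20_000)          # a prediction
pred_40k = predict(40_000)          # another prediction
diff = comparison(20_000, 40_000)   # a comparison between the two
slope_20k = slope(20_000)           # slope of the curve at 20,000
slope_40k = slope(40_000)           # slope of the curve at 40,000
print(pred_20k, pred_40k, diff, slope_20k, slope_40k)
```

Because this made-up curve is nonlinear, the slope at 40,000 is smaller than the slope at 20,000, so no single coefficient summarises the relationship; querying the model at incomes of interest does.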
my lil reproducibility talk from today / I really wanted to instill in the PhD students some simple first practices and ways to step up your game from there github.com/tjmahr/2025-...
20.08.2025 23:52
Trickster disciple. Game developer. Professional data generalist. Delightfully imperfect. Not that kind of doctor. [E] Itsatrap He/Him
not very committed to sparkle motion
Real stories of data work from the folks in the trenches.
Wednesday, 2025-09-24
https://databsconf.com/
header: Voyage Pro on Unsplash
avi: vectorsmarket15 on Flaticon
posts: @qethanm.bsky.social
A daily chart game
Can you guess the country in red in 5 tries or less?
New chart, new topic, every day!
https://chartle.cc/
consultant ยท father ยท he/him ยท human (very) ยท husband ยท itinerant ยท programmer ยท keynote speaker ยท technologist ยท trainer ยท writer
Join us at Duke University, Durham, NC, USA (EDT) August 8 - 10th!
Software Engineer, Open Source @posit.co
I love posting about #rstats, texas politics, and power markets.
Independent AI researcher, creator of datasette.io and llm.datasette.io, building open source tools for data journalism, writing about a lot of stuff at https://simonwillison.net/
I write words good (journalist, author, radio person, etc) marie.s.leconte@gmail.com
research: AI, risk, complexity, finance history
Some posts in 🇫🇷 🇩🇪 🇷🇺
- newsletter: https://complex-machinery.com
- fortunes: @fortuneexmachina.com
- blog: https://qethanm.cc
- Radar: https://www.oreilly.com/people/q-mccallum-2
HS science teacher turned data wrangler for edu nonprofit. Proud wife guy.
Professor at Wharton, studying AI and its implications for education, entrepreneurship, and work. Author of Co-Intelligence.
Book: https://a.co/d/bC2kSj1
Substack: https://www.oneusefulthing.org/
Web: https://mgmt.wharton.upenn.edu/profile/emollick
Locked in and posting regularly on here now
Senior Director, Data and Assessment at KIPP NYC, Adjunct at American Museum of Natural History. Passionate about STEM, data, education and students.
Software Engineer, Consultant & Author.
The Modern Software Engineering Channel: https://www.youtube.com/@ModernSoftwareEngineeringYT
Support Me On Patreon: https://bit.ly/ContinuousDeliveryPatreon
Software Developer, Technical Coach, YouTuber. She/her.
emilybache.com
apreshill.com
Product + Open Source + 1/3 of palmerpenguins team 🐧
VP of Product @ Anaconda
Previously: RStudio / IBM / Voltron Data / OHSU
• Director https://www.strategictranslation.org/
• Essayist http://scholars-stage.org
• Long takes on 🇨🇳 politics, 🇺🇸 conservatism, ancient history