How likely is ‘almost certainly’? The scourge of weasel words
Phrases to describe probability are getting lost in translation — and have helped to cause at least one military catastrophe
I’m in The Times today talking about how we judge probability-based language and what happens when words mean different things to different people.
This follows an online quiz I’ve been running at probability.kucharski.io over the past few weeks, with 5000+ participants and counting.
24.02.2026 08:27 —
👍 33
🔁 10
💬 2
📌 0
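As a hedged illustration of the kind of summary such a quiz enables: the responses below are invented for illustration (the actual data lives behind probability.kucharski.io), but a few lines of Python show how widely numeric interpretations of a single phrase can spread.

```python
# Hypothetical sketch: summarising how people translate a probability
# phrase into a number. Response values are invented, not quiz data.
import statistics

# Invented responses (percent) for the phrase "almost certainly".
responses = [80, 85, 90, 90, 92, 95, 95, 97, 98, 99]

median = statistics.median(responses)
q1, _, q3 = statistics.quantiles(responses, n=4)  # quartile cut points

print(f"median interpretation: {median}%")
print(f"middle 50% of answers: {q1}%-{q3}%")
```

Even for a strong phrase like "almost certainly", the middle half of these invented answers spans nearly ten percentage points, which is the ambiguity the article is about.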
Every American needs to watch this:
05.02.2026 20:41 —
👍 20527
🔁 10285
💬 580
📌 982
Adrián Detavernier, Jasper De Bock: Robustness quantification and how it allows for reliable classification, even in the presence of distribution shift and for small training sets https://arxiv.org/abs/2503.22418 https://arxiv.org/pdf/2503.22418 https://arxiv.org/html/2503.22418
31.03.2025 06:05 —
👍 0
🔁 2
💬 1
📌 1
Hack the planet!
16.09.2025 11:06 —
👍 2
🔁 3
💬 0
📌 0
No, you did not give those of us who happened to look like the people who bombed Pearl Harbor any due process. And that was profoundly wrong. It destroyed our lives.
08.08.2025 18:46 —
👍 13968
🔁 3955
💬 604
📌 211
Screenshot of the first page of a paper pre-print titled "Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor" by Olteanu et al. Paper abstract: "In AI research and practice, rigor remains largely understood in terms of methodological rigor -- such as whether mathematical, statistical, or computational methods are correctly applied. We argue that this narrow conception of rigor has contributed to the concerns raised by the responsible AI community, including overblown claims about AI capabilities. Our position is that a broader conception of what rigorous AI research and practice should entail is needed. We believe such a conception -- in addition to a more expansive understanding of (1) methodological rigor -- should include aspects related to (2) what background knowledge informs what to work on (epistemic rigor); (3) how disciplinary, community, or personal norms, standards, or beliefs influence the work (normative rigor); (4) how clearly articulated the theoretical constructs under use are (conceptual rigor); (5) what is reported and how (reporting rigor); and (6) how well-supported the inferences from existing evidence are (interpretative rigor). In doing so, we also aim to provide useful language and a framework for much-needed dialogue about the AI community's work by researchers, policymakers, journalists, and other stakeholders."
We have to talk about rigor in AI work and what it should entail. The reality is that impoverished notions of rigor do not only lead to some one-off undesirable outcomes but can have a deeply formative impact on the scientific integrity and quality of both AI research and practice 1/
18.06.2025 11:48 —
👍 63
🔁 18
💬 2
📌 3
We despise immigrants for not putting down roots, even as we make sure that it is impossible for them to do so. We do this because we have no idea what we want.
open.substack.com/pub/iandunt/...
16.05.2025 09:50 —
👍 2151
🔁 553
💬 103
📌 39
Why Tot Celebrity Ms. Rachel Waded Into the Gaza Debate
I'm embarrassed for the New York Times that they published this piece on Ms. Rachel, in which they cite a ridiculous anonymous right-wing website, Stopantisemitism, while indulging the mad, mad claim she may be funded by Hamas (!).
This isn't journalism:
15.05.2025 16:07 —
👍 3193
🔁 503
💬 132
📌 88
Just out! Our peer-reviewed critique of the Cass Review has been published by BMC Medical Research Methodology. Please read and share. We show that the Cass Review is fatally flawed and should not be the basis for policy or practice in transgender healthcare.
link.springer.com/article/10.1...
10.05.2025 12:31 —
👍 5645
🔁 2869
💬 129
📌 225
Aleatoric and epistemic uncertainty are clear-cut concepts, right? ... right? 😵💫 In our new ICLR blogpost we let different schools of thought speak and contradict each other, and revisit chatbots where “the character of aleatory ‘transforms’ into epistemic” iclr-blogposts.github.io/2025/blog/re...
08.05.2025 08:18 —
👍 31
🔁 9
💬 1
📌 0
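For readers wanting something concrete: one textbook decomposition (which the blogpost itself treats as contestable) splits an ensemble's total predictive entropy into an aleatoric part (the average entropy of individual members) and an epistemic part (the disagreement between members). A minimal sketch with invented ensemble probabilities:

```python
# Sketch of the entropy decomposition: total = aleatoric + epistemic.
# The ensemble's class probabilities below are invented for illustration.
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Invented two-class probabilities from three ensemble members.
members = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3]]

# Average the members' predictions to get the ensemble prediction.
mean_pred = [sum(m[c] for m in members) / len(members) for c in range(2)]

total = entropy(mean_pred)                                   # total uncertainty
aleatoric = sum(entropy(m) for m in members) / len(members)  # expected data noise
epistemic = total - aleatoric                                # member disagreement

print(f"total={total:.3f} aleatoric={aleatoric:.3f} epistemic={epistemic:.3f}")
```

The blogpost's point is precisely that this tidy split is less clear-cut than it looks; the sketch only shows the mechanical version that the different schools of thought argue over.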
@bagleycartoons.bsky.social
06.05.2025 22:38 —
👍 141
🔁 59
💬 1
📌 2
Even accepting the premise that AI produces useful writing (which no one should), using AI in education is like using a forklift at the gym. The weights do not actually need to be moved from place to place. That is not the work. The work is what happens within you.
15.04.2025 02:56 —
👍 10497
🔁 3371
💬 104
📌 270
A tweet by Sarah Longwell (@SarahLongwell25) reads: "He’s threatening media companies who are critical of him. He’s talking about sending Americans to foreign prisons. He’s signing executive orders to investigate former staff members who spoke out against him. Don’t you see what’s happening here?"
I see it. I have lived it. 83 years ago, the U.S. government turned upon a group of its own citizens and residents and sent them to internment camps without due process. I was there among them. American fascism is back. It is here. It is now.
15.04.2025 20:30 —
👍 45320
🔁 14447
💬 957
📌 481
Community for Rigor
Reliable research can be complicated to create. So we made a network of essential resources to help you better understand the principles and practices of scientific rigor. Why trust us? Because we’re a...
So I am leading this group building great teaching materials for scientific rigor (c4r.io). Their first unit is really coming together and I will teach it (Monday, April 21, 2025, 12:00-1:00 pm EST) to see how well it works. Join us: forms.monday.com/forms/7d978e...
08.04.2025 23:04 —
👍 20
🔁 9
💬 0
📌 0
I've really enjoyed reading this "workography" by Kees van Deemter, whom I've never met but who has had a long career in NLP. Lots of storytelling and reflections on research, moving between institutions and countries, finding mentors, choosing between academia and industry, and more.
09.04.2025 09:34 —
👍 19
🔁 3
💬 0
📌 0
Calibrating Expressions of Certainty
This study introduces a method for calibrating certainty expressions, transforming phrases like "Maybe" into probability distributions. This enhances decision-making for radiologists and fine-tunes AI models, improving uncertainty communication. https://arxiv.org/abs/2410.04315
03.04.2025 20:20 —
👍 3
🔁 1
💬 0
📌 1
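A hedged reading of the abstract: the core move is to treat a phrase like "Maybe" as a distribution over probabilities rather than a single number. The Beta parameters below are invented placeholders for illustration, not the paper's fitted values.

```python
# Sketch: represent certainty phrases as Beta distributions over
# probabilities. Parameters are invented, not taken from the paper.

PHRASE_TO_BETA = {
    "maybe": (2.0, 2.0),            # broad, centred near 0.5
    "likely": (6.0, 2.0),           # most mass above 0.5
    "almost certain": (18.0, 2.0),  # concentrated near 1.0
}

def beta_mean(a: float, b: float) -> float:
    return a / (a + b)

def beta_var(a: float, b: float) -> float:
    return a * b / ((a + b) ** 2 * (a + b + 1))

for phrase, (a, b) in PHRASE_TO_BETA.items():
    print(f"{phrase!r}: mean={beta_mean(a, b):.2f}, sd={beta_var(a, b) ** 0.5:.2f}")
```

The payoff of a distributional representation is that "maybe" carries not just a central value but an explicit width, so a downstream reader (or model) can see how vague the phrase is.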
How to Leverage Predictive Uncertainty Estimates for Reducing Catastrophic Forgetting in Online C...
Giuseppe Serra, Ben Werner, Florian Buettner
Action editor: Emmanuel Bengio
https://openreview.net/forum?id=dczXe0S1oL
#forgetting #memory #forget
02.04.2025 00:07 —
👍 3
🔁 1
💬 0
📌 0
March 31st is Trans Day of Visibility.
31.03.2025 21:07 —
👍 714
🔁 221
💬 7
📌 7
Enjoying this game very much!
26.03.2025 23:38 —
👍 1
🔁 0
💬 0
📌 0
While reading Ben Recht's article, I found Foster & Hart (2021) (arxiv.org/abs/2210.07169) quite interesting. The contribution is a proposal of always-calibrated forecaster based on a continuously-relaxed calibration measure. But I actually love their §1.1 motivating calibration.
22.03.2025 00:03 —
👍 5
🔁 1
💬 0
📌 0
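For context on what the continuous relaxation departs from, a minimal sketch of the classical binned notion of calibration: among the occasions a forecaster says "p", the event should occur about a fraction p of the time. Forecasts and outcomes below are invented for illustration.

```python
# Sketch of classical (binned) calibration error: group forecasts by
# value and compare each group's empirical frequency to the forecast.
# All forecasts and outcomes here are invented toy data.

def binned_calibration_error(forecasts, outcomes):
    """Average |empirical frequency - forecast| over distinct forecast values."""
    by_value = {}
    for p, y in zip(forecasts, outcomes):
        by_value.setdefault(p, []).append(y)
    errors = []
    for p, ys in by_value.items():
        freq = sum(ys) / len(ys)
        errors.append(abs(freq - p))
    return sum(errors) / len(errors)

# Calibrated toy forecaster: the event occurs on half of the "0.5" days.
perfect = binned_calibration_error([0.5, 0.5, 1.0], [1, 0, 1])
# Miscalibrated one: it says 0.9 but the event never happens.
bad = binned_calibration_error([0.9, 0.9], [0, 0])
print(perfect, bad)
```

The trouble motivating relaxations like Foster & Hart's is that this binned measure is discontinuous in the forecasts: nudging a forecast into a different bin can jump the error, which is what a continuously-relaxed measure smooths out.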
📣 New paper! The field of AI research is increasingly realising that benchmarks are very limited in what they can tell us about AI system performance and safety. We argue and lay out a roadmap toward a *science of AI evaluation*: arxiv.org/abs/2503.05336 🧵
20.03.2025 13:28 —
👍 38
🔁 12
💬 1
📌 1
Germany Tried to Silence Me, a UN Official, for Talking About Israel’s Genocidal War in Gaza
Francesca Albanese on her five-day trip that exposed Germany's harsh deviation from democratic values and shrinking landscape for freedom of expression.
"Germany Tried to Silence Me, a UN Official, for Talking About Israel’s Genocidal War in Gaza"
In an exclusive piece for Zeteo, UN Special Rapporteur Francesca Albanese writes about her 5-day trip that exposed Germany's harsh deviation from democratic values:
19.03.2025 17:56 —
👍 1448
🔁 420
💬 36
📌 15
Docs is an open source collaborative text editor created by a joint effort from the French 🇫🇷 and German 🇩🇪 governments.
github.com/suitenumeriq...
17.03.2025 08:03 —
👍 26
🔁 1
💬 0
📌 0