This is, however, not clever or safe writing; it is a bad collective habit that needs to stop. Not by avoiding references to causality, but by referencing it clearly
pubmed.ncbi.nlm.nih.gov/37286459/
The healthcare literature is filled with "risk factors". This word combination makes research findings sound important by implying causality, while avoiding direct claims of having identified causal associations that are easily critiqued.
31.07.2025 08:32 — 👍 24 🔁 1 💬 2 📌 2
And taking this analogy one step further: it gives genuine phone repair shops a bad name
24.07.2025 08:26 — 👍 7 🔁 0 💬 0 📌 0
When forced to make a choice, my choice will be logistic regression model over linear probability model 103% of the time
23.07.2025 20:43 — 👍 34 🔁 2 💬 0 📌 0
Image: cover picture with blog title & subtitle, and results graph in the background
Post just up: Is multiple imputation making up information?
tldr: no.
Includes a cheeky simulation study to demonstrate the point.
open.substack.com/pub/tpmorris...
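For the one-screen version of the argument, here is a minimal sketch (my own toy example, not the blog's simulation): estimate a mean from data that are roughly 50% missing completely at random, impute properly, pool with Rubin's rules, and check whether the pooled standard error pretends to have more information than the observed data. It does not; it stays close to the complete-case SE.

```python
# Does multiple imputation "make up" information? Toy example: a mean with
# ~50% MCAR missingness, proper Bayesian imputation, Rubin's rules.
import numpy as np

rng = np.random.default_rng(1)
n, m = 200, 20                       # sample size, number of imputations

x = rng.normal(loc=1.0, scale=2.0, size=n)
observed = rng.random(n) > 0.5       # ~50% missing completely at random
x_obs = x[observed]
n_obs = x_obs.size
n_mis = n - n_obs

# Complete-case (observed-data) estimate
cc_mean = x_obs.mean()
cc_se = x_obs.std(ddof=1) / np.sqrt(n_obs)

# Proper imputation: draw (sigma^2, mu) from the observed-data posterior
# (noninformative prior), then fill in the missing values from that draw
est, var = [], []
for _ in range(m):
    sigma2 = (n_obs - 1) * x_obs.var(ddof=1) / rng.chisquare(n_obs - 1)
    mu = rng.normal(cc_mean, np.sqrt(sigma2 / n_obs))
    x_imp = x.copy()
    x_imp[~observed] = rng.normal(mu, np.sqrt(sigma2), size=n_mis)
    est.append(x_imp.mean())
    var.append(x_imp.var(ddof=1) / n)

# Rubin's rules: total variance = within + (1 + 1/m) * between
qbar = np.mean(est)
T = np.mean(var) + (1 + 1 / m) * np.var(est, ddof=1)

print(f"complete-case: {cc_mean:.3f} (SE {cc_se:.3f})")
print(f"MI, pooled:    {qbar:.3f} (SE {np.sqrt(T):.3f})")
# The pooled MI SE is not smaller than the complete-case SE: the imputations
# reflect the uncertainty about the missing values rather than fabricate data.
```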
'The leaky pipe of clinical prediction models' by @maartenvsmeden.bsky.social et al
You can have all the omni-omics data in the world and the bestest algorithms, but eventually a predicted probability is produced & it should be evaluated using well-established methods, and correctly implemented in the context of medical decision making.
statsepi.substack.com/i/140315566/...
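To make "well-established methods" concrete, a minimal, hypothetical sketch (not code from the paper): check the predicted probabilities for discrimination via the c-statistic and for calibration via the intercept and slope on the logit scale. All names and numbers below are invented for illustration.

```python
# Evaluate a predicted probability: discrimination (c-statistic/AUC) and
# calibration (intercept and slope on the logit scale). Simulated data.
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 5000

# Simulated validation data: true risk depends on one predictor
x = rng.normal(size=n)
true_p = 1 / (1 + np.exp(-(-1.0 + 1.2 * x)))
y = rng.binomial(1, true_p)

# Hypothetical model predictions: systematically too extreme (overconfident)
logit_hat = -1.0 + 2.0 * x
p_hat = 1 / (1 + np.exp(-logit_hat))

# Discrimination
auc = roc_auc_score(y, p_hat)

# Calibration: regress the outcome on the logit of the predictions;
# a slope well below 1 signals too-extreme (overfitted) predictions
fit = sm.Logit(y, sm.add_constant(logit_hat)).fit(disp=0)
cal_intercept, cal_slope = fit.params

print(f"c-statistic (AUC): {auc:.3f}")
print(f"calibration intercept: {cal_intercept:.2f}, slope: {cal_slope:.2f}")
# Decent AUC, but a calibration slope well below 1: the probabilities
# themselves are off, which is what matters for medical decision making.
```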
Clients: “I want to find real, meaningful clusters”
Me: “I want world peace, which is more likely to happen than what you want”
Depending on which methods guru you ask, every analytical task is "essentially" a missing data problem, a causal inference problem, a Bayesian problem, a regression problem or a machine learning problem
10.07.2025 15:05 — 👍 59 🔁 6 💬 5 📌 3
In medicine they are called "risk factors" and, of course, you want all "important" risk factors in your model all the time
Unless a risk factor is not statistically significant, in which case you can drop that factor without issues
* New preprint led by Joao Matos & @gscollins.bsky.social
"Critical Appraisal of Fairness Metrics in Clinical Predictive AI"
- Important, rapidly growing area
- But confusion exists
- 62 fairness metrics identified so far
- Better standards & metrics needed for healthcare
arxiv.org/abs/2506.17035
Also, the fact that the model with the best AUC doesn't always make the best predictions is lost in such cases too
27.06.2025 07:35 — 👍 2 🔁 0 💬 1 📌 0
Surprisingly common thing: comparisons of prediction models developed using, say, Logistic Regression, Random Forest and XGBoost with conclusion XGBoost is "good" because it yields slightly higher AUC than LR or RF using the same data
Fact that "better" doesn't always mean "good" seems lost
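A tiny, made-up illustration of why: the AUC only looks at the ranking of predictions, so a monotone distortion of well-calibrated risks leaves the AUC untouched while the probabilities themselves (and the Brier score) go badly wrong.

```python
# Higher (or equal) AUC does not mean the predictions are good: AUC reflects
# ranking only, not whether the predicted probabilities are anywhere near the
# truth. Simulated example.
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(3)
n = 20000
true_p = 1 / (1 + np.exp(-rng.normal(-1.0, 1.5, size=n)))
y = rng.binomial(1, true_p)

p_calibrated = true_p          # model A: the true risks, perfectly calibrated
p_distorted = true_p ** 4      # model B: same ranking, badly miscalibrated

for name, p in [("calibrated", p_calibrated), ("distorted ", p_distorted)]:
    print(f"{name}: AUC = {roc_auc_score(y, p):.3f}, "
          f"Brier = {brier_score_loss(y, p):.3f}, "
          f"mean predicted risk = {p.mean():.3f} (event rate {y.mean():.3f})")
# Identical AUCs, very different Brier scores and average predicted risk:
# picking "the best model" on AUC alone says nothing about calibration.
```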
Published: the paper 'On the uses and abuses of Regression Models: a Call for Reform of Statistical Practice and Teaching' by John Carlin and Margarita Moreno-Betancur in the latest issue of Statistics in Medicine onlinelibrary.wiley.com/doi/10.1002/... (1/8)
26.06.2025 12:23 — 👍 47 🔁 16 💬 3 📌 1
What is common knowledge in your field, but shocks outsiders?
Validated does not mean it works as intended. It means someone has evaluated it (and may have concluded it doesn’t work at all)
**New Lancet DH paper**
"Importance of sample size on the quality & utility of AI-based prediction models for healthcare"
- for broad audience
- explains why inadequate SS harms #AI model training, evaluation & performance
- pushback to claims SS irrelevant to AI research
👇
tinyurl.com/yrje52fn
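A rough, hypothetical simulation of the core point (not taken from the paper): develop the same logistic model on increasingly small samples and the apparent performance becomes optimistic while the performance in new data deteriorates.

```python
# Why sample size matters for prediction models: small development sets give
# optimistic apparent performance and worse performance in new data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

def simulate(n, p=20):
    """Simulate data with 5 real predictors and 15 noise predictors."""
    X = rng.normal(size=(n, p))
    beta = np.concatenate([np.repeat(0.5, 5), np.zeros(p - 5)])
    pr = 1 / (1 + np.exp(-(X @ beta - 1.0)))
    return X, rng.binomial(1, pr)

X_test, y_test = simulate(50_000)       # large test set stands in for "truth"

for n_dev in (100, 1_000, 10_000):
    X_dev, y_dev = simulate(n_dev)
    model = LogisticRegression(max_iter=1000).fit(X_dev, y_dev)
    apparent = roc_auc_score(y_dev, model.predict_proba(X_dev)[:, 1])
    test = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"n = {n_dev:6d}: apparent AUC {apparent:.3f}, test AUC {test:.3f}")
# The smaller the development sample, the larger the optimism and the worse
# the model does on new data.
```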
People always ask me, “how do I know my manuscript is done?”
There’s only one way, my friends.
If your file name looks something like this:
Manuscript - Final Draft 3.7 FINAL FINAL - FINAL (5).docx
Then, and only then, is it time.
Tempted
01.06.2025 10:06 — 👍 6 🔁 0 💬 1 📌 0
Re-proposing the Occam's taser: an automatic electric shock for anyone riding the AI hype train making their models unnecessarily complex
27.05.2025 14:38 — 👍 12 🔁 2 💬 0 📌 0
You just don't appreciate modern #dataviz
27.05.2025 14:32 — 👍 4 🔁 0 💬 0 📌 0
Rule of thumb: If your model requires data to look like this (balanced after SMOTE), then maybe you want to use a different model
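In case the image doesn't load, here is the issue in miniature, sketched with plain random oversampling as a stand-in for SMOTE-type rebalancing (hypothetical toy data, nobody's real example): once you force the training data to be 50/50, the model's predicted risks no longer resemble the actual event rate.

```python
# Forcing a balanced training set for a risk prediction model destroys
# calibration by design: the average predicted risk drifts far above the
# true event rate. Random oversampling used as a stand-in for SMOTE.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
X = rng.normal(size=(n, 3))
pr = 1 / (1 + np.exp(-(X @ np.array([1.0, 0.5, -0.5]) - 3.0)))  # rare outcome
y = rng.binomial(1, pr)

# "Balanced" training set: duplicate minority-class rows until classes are 50/50
idx_min = np.where(y == 1)[0]
extra = rng.choice(idx_min, size=(y == 0).sum() - idx_min.size, replace=True)
X_bal, y_bal = np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])

for name, (Xt, yt) in {"original": (X, y), "rebalanced": (X_bal, y_bal)}.items():
    model = LogisticRegression(max_iter=1000).fit(Xt, yt)
    mean_pred = model.predict_proba(X)[:, 1].mean()
    print(f"{name:>10}: mean predicted risk {mean_pred:.3f} "
          f"(true event rate {y.mean():.3f})")
# The rebalanced model's average predicted risk sits far above the true event
# rate: it answers a different question than the one the clinic asked.
```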
27.05.2025 13:43 — 👍 18 🔁 3 💬 4 📌 0
We should really ban all use of AI from all our education because the use of AI will make our students dumb, and banning innovation has worked really well before
19.05.2025 14:31 — 👍 22 🔁 3 💬 3 📌 0
As scientists it is difficult to stay in love with the questions once you fall in love with the answers
19.05.2025 09:14 — 👍 54 🔁 9 💬 0 📌 1
Screenshot says ‘To a systematic-reviewer, the results of groups A to D may feel like turning up with a dustpan and brush after an earthquake because:’
So glad they let us use this turn-of-phrase
@maartenvsmeden.bsky.social
doi.org/10.1016/j.jc...
But what if being yourself is generating all your text using GPT?
12.05.2025 10:54 — 👍 6 🔁 0 💬 0 📌 0
Wow, that *is* the exact wording of each of these paragraphs
12.05.2025 10:51 — 👍 10 🔁 0 💬 1 📌 0
So.... about using large language models (e.g. ChatGPT) for writing motivation letters for a job
I get it! And honestly, use all the technology you need to write the best letter you can
But after reading dozens of letters with almost EXACTLY the same intro paragraph I do get a bit tired of it
Only 5 days left to apply for our 3 PhD positions at the @umcutrecht.bsky.social 🎓 🏃🏃♀️
11.05.2025 06:37 — 👍 2 🔁 1 💬 0 📌 0
Code or it did not happen
09.05.2025 14:19 — 👍 9 🔁 4 💬 0 📌 1
‘Wow, this treatment was no better than placebo – because the placebo was so effective!’
It’s such a shame the term ‘placebo effect’ is so widely known and used. There are fundamental challenges with identifying it and most of the claims don’t even try. It’s just change-from-baseline every time.