Wouter van Amsterdam vanamsterdam

work with

Diantha Schipaanboord, Floor B.H. van der Zalm, René van Es, Melle Vessies, Rutger R. van de Leur, Klaske R. Siegersma, Pim van der Harst, Hester M. den Ruijter, N. Charlotte Onland-Moret, on behalf of the IMPRESS consortium

04.09.2025 12:03 — 👍 0 🔁 0 💬 0 📌 0

ECG classification with convolutional neural networks demonstrates resilience to sex-imbalances in data Background: Many ECG-AI models have been developed to predict a wide range of cardiovascular outcomes. The underrepresentation of women in cardiovascular disease studies has raised concerns if these m...

Conclusion: The convolutional neural networks in this study demonstrated resilience to simulated sex-imbalance in training ECG data.

pre-print: doi.org/10.1101/2025...

04.09.2025 12:03 — 👍 2 🔁 0 💬 1 📌 0

Discrimination remained stable across sexes; only calibration shifted in extreme scenarios when prevalence differed by sex, with similar patterns for women and men.

04.09.2025 12:03 — 👍 0 🔁 0 💬 1 📌 0

Using ~165k ECGs, we simulated sex-imbalances in representation (women-to-men ratio), outcome prevalence, and misclassification in the training data for LBBB, long QT syndrome, LVH, and physician-labeled “abnormal” ECGs.

04.09.2025 12:03 — 👍 0 🔁 0 💬 1 📌 0

Pre-print alert:
Many ECG-AI models have been developed to predict a wide range of cardiovascular outcomes. But, underrepresentation of women in cardiovascular studies raises the question: Are ECG-AI models equally predictive for women and men with sex-imbalanced training data?

04.09.2025 12:03 — 👍 2 🔁 0 💬 1 📌 0

The Risks of Risk Assessment: Causal Blind Spots When Using Prediction Models for Treatment Decisions | Annals of Internal Medicine Clinicians increasingly rely on prediction models to guide treatment choices. Most prediction models, however, are developed using observational data that include some patients who have already receiv...

New paper in @annalsofim.bsky.social

"50 ways to misinterpret clinical prediction models for treatment decisions”

--> Published version: www.acpjournals.org/doi/10.7326/...

--> Open access version: arxiv.org/pdf/2402.17366

11.08.2025 14:32 — 👍 17 🔁 10 💬 2 📌 0

Hans van Houwelingen award ceremony and symposium June 19th 2025 - VVSOR This spring, the BMS-ANed organises an in-person meeting:

BMS-ANed Spring Meeting on Thursday, June 19
Time: 13:00–18:00 (CEST)
Location: Vredenburg 19, 3511 BB, Utrecht
Details and registration: vvsor.nl/biometrics/e...

06.06.2025 12:54 — 👍 1 🔁 1 💬 0 📌 0

Still some spots available in our summer school on all things causal inference, 7-11 July in Utrecht! Discounts for those working in universities and non-profits, and affordable accommodation offered by @utrechtuniversity.bsky.social summer school!

28.04.2025 08:00 — 👍 7 🔁 6 💬 1 📌 0

Even if you model a physical system, e.g. average yearly temperature depending on height, and assume that temperature given height is the same everywhere: if you invert it into predicting the presence of a mountain given temperature, you’ll find varying discrimination in different countries. Example from Schölkopf’s talks.

25.04.2025 14:58 — 👍 0 🔁 0 💬 0 📌 0

You’ve modeled a system with no meaningful variation across environments. The model may be reliable in the tested environments but you haven’t shown robustness against variation in distributions as you haven’t observed any

25.04.2025 14:56 — 👍 0 🔁 0 💬 2 📌 0

A causal viewpoint on prediction model performance under changes in case-mix: discrimination and calibration respond differently for prognosis and diagnosis predictions Prediction models need reliable predictive performance as they inform clinical decisions, aiding in diagnosis, prognosis, and treatment planning. The predictive performance of these models is typicall...

A question that remains is how these differences between environments come about and what to do about them in practice. On this, I wrote a paper, available here: arxiv.org/abs/2409.01444

fin!

25.04.2025 11:13 — 👍 2 🔁 1 💬 0 📌 0

if the distribution of outcome given features remains the same (Y|X), calibration is preserved. If both are the same, the environments were not meaningfully different to begin with!

a more lengthy explanation is in this blog post: wvanamsterdam.com/posts/250425...
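
A minimal simulation sketch of the Y|X half of this claim, not taken from the paper or the blog post. It assumes a made-up logistic risk function `risk(x)` that is identical in both environments, while the spread of the feature (`feature_sd`, a hypothetical parameter) differs. Under those assumptions, mean predicted risk should track the observed event rate in both environments, while the AUC is free to change.

```python
import math
import random


def risk(x):
    """Assumed P(Y=1 | x); the same function in every environment."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(feature_sd, n, rng):
    """Draw (x, y) pairs: X ~ N(0, feature_sd^2), then Y | X from risk(x)."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, feature_sd)
        y = 1 if rng.random() < risk(x) else 0
        pairs.append((x, y))
    return pairs


def auc(pairs):
    """Rank-based AUC: chance that a random case scores above a random non-case."""
    ranked = sorted(pairs)
    n_pos = sum(y for _, y in ranked)
    n_neg = len(ranked) - n_pos
    rank_sum = sum(i + 1 for i, (_, y) in enumerate(ranked) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def summarize(pairs):
    """Return (AUC, mean predicted risk, observed event rate)."""
    observed = sum(y for _, y in pairs) / len(pairs)
    mean_predicted = sum(risk(x) for x, _ in pairs) / len(pairs)
    return auc(pairs), mean_predicted, observed


rng = random.Random(0)
# Two environments that differ only in the distribution of X.
results = {sd: summarize(simulate(sd, 40_000, rng)) for sd in (1.0, 3.0)}
for sd, (a, pred, obs) in results.items():
    print(f"feature sd={sd}: AUC={a:.3f}, mean predicted={pred:.3f}, observed={obs:.3f}")
```

Comparing mean predicted risk with the observed rate only checks calibration-in-the-large; a fuller check would compare the two within bins of predicted risk.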

25.04.2025 11:13 — 👍 1 🔁 1 💬 2 📌 0

as promised (so all of you can breathe normally again), here's my TLDR answer:

Environments must differ with respect to something. If the distribution of features given outcome remains the same (X|Y), discrimination is preserved;
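
A minimal simulation sketch of the X|Y claim, not taken from the paper. It assumes a single feature that is N(1, 1) for cases and N(0, 1) for non-cases in every environment, and a model `risk(x)` that is the correct risk function at a development prevalence of 0.5; all names and numbers are illustrative assumptions. Only the outcome prevalence changes between the two environments.

```python
import math
import random


def simulate(prevalence, n, rng):
    """Draw (x, y) pairs with X | Y fixed: N(1, 1) for cases, N(0, 1) for non-cases."""
    pairs = []
    for _ in range(n):
        y = 1 if rng.random() < prevalence else 0
        pairs.append((rng.gauss(float(y), 1.0), y))
    return pairs


def risk(x):
    """P(Y=1 | x) at prevalence 0.5: log-odds = log likelihood ratio = x - 0.5."""
    return 1.0 / (1.0 + math.exp(-(x - 0.5)))


def auc(pairs):
    """Rank-based AUC: chance that a random case scores above a random non-case."""
    ranked = sorted(pairs)
    n_pos = sum(y for _, y in ranked)
    n_neg = len(ranked) - n_pos
    rank_sum = sum(i + 1 for i, (_, y) in enumerate(ranked) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def summarize(pairs):
    """Return (AUC, mean predicted risk, observed event rate)."""
    observed = sum(y for _, y in pairs) / len(pairs)
    mean_predicted = sum(risk(x) for x, _ in pairs) / len(pairs)
    return auc(pairs), mean_predicted, observed


rng = random.Random(0)
# Development-like environment (prevalence 0.5) and a low-prevalence environment (0.1).
results = {prev: summarize(simulate(prev, 40_000, rng)) for prev in (0.5, 0.1)}
for prev, (a, pred, obs) in results.items():
    print(f"prevalence={prev}: AUC={a:.3f}, mean predicted={pred:.3f}, observed={obs:.3f}")
```

In this setup the AUC should be about the same in both environments, while the model overpredicts risk where the prevalence is lower than in development.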

25.04.2025 11:13 — 👍 0 🔁 0 💬 1 📌 0

tagging some prediction modelers / statisticians, @maartenvsmeden.bsky.social @benvancalster.bsky.social @gelovennan.bsky.social @f2harrell.bsky.social @lucystats.bsky.social @miguelhernan.org @gscollins.bsky.social

(I will answer tomorrow)

24.04.2025 14:41 — 👍 0 🔁 0 💬 0 📌 0

Which is stronger evidence for robustness?

When evaluating predictive performance of one model in several different environments (e.g. regions / hospitals):

A. stable discrimination (AUC) and calibration in all environments
B. stable discrimination, varying calibration

vote with 👍=A; ❤️=B

24.04.2025 14:41 — 👍 1 🔁 1 💬 3 📌 0

ask ChatGPT o3 this before submitting your next paper; I got ~10 usable comments out of it:

you're a reviewer for <journal>; review the attached paper when you're either:

23.04.2025 15:11 — 👍 2 🔁 0 💬 0 📌 0

what are the exceptions?

11.04.2025 06:12 — 👍 0 🔁 0 💬 1 📌 0

Individual treatment effect estimation in the presence of unobserved confounding using proxies: a cohort study in stage III non-small cell lung cancer - Scientific Reports

2. an external reproduction of the PROTECT method from Manchester University with Charlie Cuniffe, Matt Sperrin and Gareth Price (www.nature.com/articles/s41...)

3. a 'causal' meta-analysis method using only aggregate data, exciting work with Qingyang Shi from Groningen University

09.04.2025 06:28 — 👍 1 🔁 0 💬 0 📌 0

A causal viewpoint on prediction model performance under changes in case-mix: discrimination and calibration respond differently for prognosis and diagnosis predictions Prediction models inform important clinical decisions, aiding in diagnosis, prognosis, and treatment planning. The predictive performance of these models is typically assessed through discrimination a...

Very excited for my first (belated) visit to #EuroCIM2025!

I'm here with 3 bits of work:

1. a poster on a causal understanding of prediction model performance under shifts in 'case-mix' (or covariate / outcome drift); I show how discrimination and calibration respond differently
bit.ly/ccm-arxiv

09.04.2025 06:28 — 👍 10 🔁 2 💬 2 📌 0

An Overview of Large Language Models for Statisticians Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and decis...

this seems pretty cool: an overview of llms for statisticians

arxiv.org/abs/2502.17814

03.03.2025 21:21 — 👍 4 🔁 0 💬 0 📌 0

Postdoc Biomedical Data Scientist / Biostatistician | LUMC In this postdoc position at LUMC, you will work on groundbreaking research that enhances the transparency and trustworthiness of decision support algorithms in healthcare. This position allows you to ...

Vacancy for a postdoc position.

Improve the transparency of decision support algorithms by figuring out how we can quantify and communicate uncertainty in individual causal predictions.

With Marleen Kunneman, Daniala Weir and me.
Three more days to apply 👇

www.lumc.nl/en/about-lum...

30.12.2024 14:56 — 👍 6 🔁 5 💬 0 📌 0

Building in the physics is one way to potentially get the right causal mechanisms

Insofar as the model is trained on real-world patient data, you'll still have to ensure that no biases, e.g. related to confounding, creep in

23.12.2024 08:01 — 👍 2 🔁 0 💬 1 📌 0

Digital twins are useful insofar as they reflect causal mechanisms

Don't think a generative model ('digital twin') can inform treatment decisions just because it produces different outputs when you give it different inputs. Doesn't matter if it's 'AI' or not.

22.12.2024 19:59 — 👍 6 🔁 0 💬 1 📌 0

saliency maps are the new table 2 fallacy

17.12.2024 13:21 — 👍 0 🔁 0 💬 0 📌 0

Not sure about overfitting, results seemed robust to 5-site cross validation.

It just learns correlations, what's wrong with that? The words 'confounders' and 'bias' make it sound like they expected the model to yield some causal understanding. Maybe these heatmaps are the new table 2 fallacy

16.12.2024 19:23 — 👍 2 🔁 0 💬 1 📌 0

Awesome, congrats!

16.12.2024 19:00 — 👍 1 🔁 0 💬 1 📌 0

Liking this interaction with @mmbronstein.bsky.social and Denis Danilov so much I'm reposting it here

06.12.2024 16:13 — 👍 38 🔁 4 💬 3 📌 0

Introduction to Causal Inference and Causal Data Science | Utrecht Summer School The course takes an interdisciplinary approach and is suitable for applied researchers across health, social and behavioural sciences.

Interested in how to use non-experimental data to answer causal research questions? Mystified by DAGs and counterfactuals? Want to learn what Target Trial Emulation is all about?

Sign up now for the 2nd edition of our summer school, 7-11 July in Utrecht, with @vanamsterdam.bsky.social & BPdeVries

04.12.2024 08:25 — 👍 54 🔁 14 💬 1 📌 3

Probably more like "the average of an infinite sequence of throws hits the bull's eye"

27.11.2024 14:22 — 👍 1 🔁 0 💬 1 📌 0

@oisinryan.bsky.social and I are developing a Julia package for target trial emulation with a student, happy to be added to the list

17.11.2024 19:04 — 👍 3 🔁 0 💬 0 📌 0

Posts by Wouter van Amsterdam (@vanamsterdam.bsky.social)