The Dead Salmons of AI Interpretability
In a striking neuroscience study, the authors placed a dead salmon in an MRI scanner and showed it images of humans in social situations. Astonishingly, the standard statistical analyses of the time reported brain regions in the dead fish responding to the images.
We thus suggest that XAI researchers treat explanations as statistical estimators. Quantifying uncertainty and testing against random baselines will help align XAI with the standards of experimental science.
arxiv.org/abs/2512.18792 - with François Portet, Giada Dirupo and @peyrardmax.bsky.social
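Treating an explanation score as a statistical estimator means, at minimum, comparing it against a random baseline. A minimal stdlib-only sketch of that idea (the 1-D "activations", effect size, and nearest-centroid probe are illustrative assumptions, not the paper's setup): permutation-test a probe's accuracy against a null distribution obtained by shuffling labels.

```python
import random

random.seed(0)

def probe_accuracy(feats, labels):
    """Toy nearest-centroid 'probe': classify by the closer class mean."""
    mean = {c: sum(f for f, l in zip(feats, labels) if l == c) / labels.count(c)
            for c in set(labels)}
    preds = [min(mean, key=lambda c: abs(f - mean[c])) for f in feats]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

# Hypothetical 1-D "activations", weakly correlated with the labels.
labels = [0, 1] * 50
feats = [random.gauss(0.3 * l, 1.0) for l in labels]

observed = probe_accuracy(feats, labels)

# Random baseline: re-score under shuffled labels to build a null distribution.
null = []
for _ in range(1000):
    shuffled = labels[:]
    random.shuffle(shuffled)
    null.append(probe_accuracy(feats, shuffled))

p_value = sum(n >= observed for n in null) / len(null)
print(f"observed accuracy={observed:.2f}  permutation p={p_value:.3f}")
```

The p-value quantifies how often a probe fit to pure noise matches the observed score, which is exactly the control the dead-salmon study was missing.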
08.01.2026 15:28
This highlights a fundamental non-identifiability problem: multiple incompatible explanations can fit the same computation.
Other experimental sciences overcame similar issues in the past by adopting rigorous statistical frameworks.
2/3
08.01.2026 15:24
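The non-identifiability point above can be made concrete with a toy case (an illustrative example, not taken from the paper): when two input features are perfectly correlated, two incompatible linear "explanations" fit the same computation, and no behavioural test can tell them apart.

```python
# Two 'explanations' (weight assignments) for the same behaviour.
def model_a(x1, x2):
    # Credits feature 1 entirely.
    return 2.0 * x1 + 0.0 * x2

def model_b(x1, x2):
    # Credits feature 2 entirely.
    return 0.0 * x1 + 2.0 * x2

# On data where the two features are duplicates, the computations agree...
data = [(v, v) for v in range(-5, 6)]
assert all(model_a(*p) == model_b(*p) for p in data)

# ...yet gradient-style attributions assign all the credit differently.
print("attribution A:", (2.0, 0.0))
print("attribution B:", (0.0, 2.0))
```

Off the data manifold the two models diverge (e.g. at input (1, 0)), which is one reason competing explanation methods can each look "faithful" on the same model.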
Our new paper, "The Dead Salmons of AI interpretability", is out!
In 2009, researchers showed that standard statistical errors could detect "brain activity" in a dead salmon 🐟.
Modern XAI methods face similar issues: we find interpretable neurons and probes even in randomly initialized models.
1/X
08.01.2026 15:23
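The "probes succeed even in randomly initialized models" effect is easy to reproduce in miniature. A stdlib-only sketch under stated assumptions (a random untrained ReLU layer, Gaussian two-class toy data, a nearest-centroid probe; none of this is the paper's experimental setup): the probe recovers the classes from the untrained network's activations, because the class information sits in the data, not in any learned structure.

```python
import random

random.seed(1)

D, H = 20, 8  # input dim, hidden width of the untrained 'network'

# Random, never-trained layer: weights drawn once and never fitted to the task.
W = [[random.gauss(0, 1) for _ in range(D)] for _ in range(H)]

def relu_layer(x):
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W]

# Toy inputs from two classes with different means (hypothetical data).
def sample(label):
    mu = 1.0 if label else -1.0
    return [random.gauss(mu, 1.0) for _ in range(D)]

labels = [i % 2 for i in range(200)]
hidden = [relu_layer(sample(l)) for l in labels]

# Nearest-centroid probe fit on the *random* hidden activations.
def centroid(cls):
    rows = [h for h, l in zip(hidden, labels) if l == cls]
    return [sum(col) / len(rows) for col in zip(*rows)]

c0, c1 = centroid(0), centroid(1)
dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
preds = [int(dist(h, c1) < dist(h, c0)) for h in hidden]
acc = sum(p == l for p, l in zip(preds, labels)) / len(labels)
print(f"probe accuracy on an untrained network: {acc:.2f}")
```

A high score here says nothing about what the network computes, which is why probe results need the random-model control the thread argues for.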
I'm very happy to present our work "Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?" this afternoon at #ICLR2025! Come have a chat at stand #439 :)
26.04.2025 02:26