The Dead Salmons of AI Interpretability
In a striking neuroscience study, the authors placed a dead salmon in an MRI scanner and showed it images of humans in social situations. Astonishingly, standard analyses of the time reported brain re...
To take a step in this direction, we propose a statistical perspective of XAI, focusing on inference, uncertainty quantification and hypothesis testing. But it is only a start!
(with @maximemeloux.bsky.social, Giada Dirupo, François Portet) @cnrs.fr @getalp.bsky.social @cnrsalpes.bsky.social
08.01.2026 10:57
Psychology, econometrics, and neuroscience have faced similar difficulties and responded by adopting methodological reforms and rigorous statistical (causal) frameworks.
We believe it is now our turn to build the methodological guardrails that turn XAI into a pragmatic science.
08.01.2026 10:57
This affects feature attribution, probing, SAEs, and even causal analyses.
Taking a statistical view, we argue that most interpretability queries are non-identifiable: multiple incompatible explanations fit the same computation, leading to false positives and poor generalization.
08.01.2026 10:57
New preprint: The Dead Salmons of XAI
Standard fMRI pipelines once detected predictive brain regions in a dead salmon! A warning about poor statistical methodology
Now, XAI faces its own issues: many methods can yield plausible explanations even for randomized networks
arxiv.org/abs/2512.18792
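To make the dead-salmon warning concrete, here is a minimal sketch (not taken from the preprint) of how uncorrected testing "discovers" structure in pure noise. It assumes the issue at play is multiple comparisons; the function names, the number of units, and the Fisher z approximation for p-values are all illustrative choices.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def p_value(r, n):
    """Two-sided p-value for r using the Fisher z normal approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def count_discoveries(n_units=1000, n_samples=50, alpha=0.05, seed=0):
    """Test many pure-noise 'units' against a pure-noise target."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    pvals = []
    for _ in range(n_units):
        unit = [rng.gauss(0, 1) for _ in range(n_samples)]
        pvals.append(p_value(pearson_r(unit, target), n_samples))
    # Hypothetical analysis 1: threshold every test at alpha, no correction.
    uncorrected = sum(p < alpha for p in pvals)
    # Hypothetical analysis 2: Bonferroni correction for n_units tests.
    bonferroni = sum(p < alpha / n_units for p in pvals)
    return uncorrected, bonferroni


uncorrected, bonferroni = count_discoveries()
print(f"'significant' noise units without correction: {uncorrected}")
print(f"'significant' noise units with Bonferroni:    {bonferroni}")
```

With no real signal anywhere, roughly a fraction alpha of the units is still expected to pass the uncorrected threshold, which is the kind of false positive the salmon study illustrated.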
08.01.2026 10:57
I'm recruiting multiple PhD students for Fall 2026 in Computer Science at @hopkinsengineer.bsky.social
Apply to work on AI for social sciences/human behavior, social NLP, and LLMs for real-world applied domains you're passionate about!
Learn more at kristinagligoric.com & help spread the word!
05.11.2025 14:56
I'm very happy to present our work "Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?" this afternoon at #ICLR2025! Come have a chat at stand #439 :)
26.04.2025 02:26
What can be done?
- Stricter validity criteria?
- Maybe interpretability is inherently underdetermined, and we can only get control and predictability, not "understanding"?
This is a fascinating topic, and we keep investigating. If you're interested, come and chat at ICLR!
21.04.2025 13:51
We find a lot of identifiability issues:
- Multiple explanatory algorithms exist
- Even for one algorithm, there are many localizations in the network
Identifiability problems remain across scenarios: changing levels of over-parametrization, progress in training, multi-task settings, model size.
21.04.2025 13:51
In our work, we stress-test the identifiability of research programs of MI with small MLPs and simple boolean logic tasks.
Why? It allows us to enumerate all possible explanations and see how many pass various MI testing criteria.
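A toy sketch of what "enumerating all possible explanations" can look like on a boolean task. This is not the paper's procedure: the gate set, the two-layer circuit shape, and all names below are assumptions chosen for illustration. It lists every circuit of the form outer(left(a, b), right(a, b)) that computes XOR.

```python
from itertools import product

# Hypothetical gate vocabulary for the candidate "explanations".
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
}

INPUTS = list(product((0, 1), repeat=2))


def computes_xor(outer, left, right):
    """Check the circuit outer(left(a, b), right(a, b)) against XOR on all inputs."""
    return all(
        GATES[outer](GATES[left](a, b), GATES[right](a, b)) == (a ^ b)
        for a, b in INPUTS
    )


# Exhaustively enumerate every (outer, left, right) gate assignment.
xor_circuits = [
    combo for combo in product(GATES, repeat=3) if computes_xor(*combo)
]

for outer, left, right in xor_circuits:
    print(f"{outer}({left}(a, b), {right}(a, b))")
```

Even in this tiny search space, several structurally different circuits match XOR on every input, so input-output behavior alone cannot single out one of them as "the" algorithm.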
21.04.2025 13:51
This brings us to identifiability. In statistics, a property is identifiable if only one value is compatible with the data. Identifiability matters because it is a prerequisite for statistical and causal inference.
Interpretability is also an exercise in causal inference!
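A minimal illustration of non-identifiability, assuming a toy model y = a * b * x that is not from the paper: the data only constrain the product a * b, so many (a, b) pairs are equally compatible with it. All names and values are hypothetical.

```python
def model(x, a, b):
    # Toy model: the output depends on the parameters only through a * b.
    return a * b * x


def fits_data(params, data, tol=1e-9):
    """True if the parameter pair reproduces every (x, y) observation."""
    a, b = params
    return all(abs(model(x, a, b) - y) <= tol for x, y in data)


# Noise-free observations generated with a * b = 6.
data = [(x, 6.0 * x) for x in (-2.0, -1.0, 0.5, 3.0)]

# Candidate "explanations" of the data.
candidates = [(2.0, 3.0), (3.0, 2.0), (1.0, 6.0), (2.0, 2.0)]
compatible = [p for p in candidates if fits_data(p, data)]

print("compatible parameter pairs:", compatible)
```

Here the product a * b is identifiable but a and b individually are not; collecting more data of the same kind would not change that.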
21.04.2025 13:51
Illustration of different strategies for mechanistic interpretability
Mechanistic Interpretability aims to produce statements like: "Model M solves task T by doing X."
To do so, many causal manipulations are performed to validate an explanation. But what if (many) other, incompatible explanations also pass the causal tests?
21.04.2025 13:51
Abstract of the paper
Our paper "Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?" will be presented at #ICLR2025!
It's also the first paper of my first PhD student, congrats @maximemeloux.bsky.social!
blog: melouxm.github.io/MI-identifia...
An explanatory thread 🧵:
21.04.2025 13:51
An assembly of 18 European companies, labs, and universities has banded together to launch 🇪🇺 EuroBERT!
It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc.
Details in 🧵
10.03.2025 09:43
bc i haven't done so yet, i decided to burn any remaining bridge to the land of statistics. it wasn't statisticians nor statistics but it was me. i am simply not good enough to do statistics myself.
so, @peyrardmax.bsky.social and i decided to turn statistical estimation into supervised learning.
18.02.2025 18:12
Check out our new paper on social determinants of on-campus food choice, now out in @pnasnexus.org!
academic.oup.com/pnasnexus/ar...
04.12.2024 15:30
Hey, thanks for making it, can you also add me?
24.11.2024 00:21
I tried to find everyone who works in the area but I certainly missed some folks so please lmk...
go.bsky.app/BYkRryU
23.11.2024 05:11
Thanks for creating the pack, I am also working on this topic :)
23.11.2024 16:59