the greatest joy of being a computational scientist is having the computer work for you while you do something else
15.01.2026 09:29
"Interpretability plays a special role in machine learning because instead of focusing on making the AI smarter, we focus on improving human insight. I think this is the most important category of interpretability research, and we do not do enough of it."
A poster titled "a circular argument" which has been cut into a circular shape
It's a CIRCULAR poster! #eurips presenters innovating in poster design / fine motor skills
04.12.2025 16:34
a hand-written poster on a poster board, featuring a hand-drawn QR code (the code does not work)
remember to always include a QR code on your poster. spotted at #eurips
04.12.2025 16:18
What coding with an LLM feels like sometimes.
03.12.2025 09:29
when I ask candidates whether they've worked with "real medical data" this is the kind of thing that I mean
23.11.2025 17:05
found a file from PhD days with the FORTY-EIGHT ways "ACE inhibitor" was encoded in the EHR system we were working with
23.11.2025 17:04
finally got around to booking my travel for #EurIPS2025! Looking forward to connecting with the European ML scene in Copenhagen
16.11.2025 17:17
uv is so good
21.09.2025 22:25
Some papers really have a good intro
10.09.2025 21:26
The more rigorous peer review happens in conversations and reading groups after the paper is out with reputational costs for publishing bad work
17.08.2025 16:12
Google's Gemini AI tells a Redditor it's 'cautiously optimistic' about fixing a coding bug, fails repeatedly, calls itself an embarrassment to 'all possible and impossible universes' before repeating 'I am a disgrace' 86 times in succession
I'll admit, I was skeptical when they said Gemini was just like a bunch of PhDs. But I gotta admit they nailed it.
17.08.2025 13:51
what is the purpose of VQA datasets where text-only models do better than random?
14.08.2025 14:08
Zotero screenshot showing four different papers with titles beginning with "MedAgent"
lads can we stop
13.08.2025 13:34
diagram from Anthropic paper with an icon & label that says "subtract evil vector"
quick diagram of Bluesky's architecture and why it's nicer here
02.08.2025 23:19
Emojis and massive try: except blocks. GitHub Copilot (at least Claude Sonnet 4) is very concerned about error handling.
03.08.2025 06:46
if openreview were a lot fancier you could dynamically reallocate/cancel remaining reviews once a paper meets that expected minimum.
ideally you would mark these remaining reviews as optional rather than fully cancelled, in case that reviewer has already done work
it's frustrating how inefficient review assignments are: we target a minimum number of completed reviews per paper but in accounting for inevitable no-shows, some people end up doing technically unnecessary (if still beneficial) reviews
30.07.2025 16:23
How many AI researchers fold their own laundry?
29.07.2025 06:29
I am in the UK so feel free to discard, but I recently noticed Discord asking for age verification for some channels:
25.07.2025 07:02
ALSO we have released the SAEs we trained, and the automated interp for all(!!)* features:
huggingface.co/microsoft/ma...
*all features for a subset of SAEs, we didn't run the full auto-interp pipeline on the widest SAE
We also found that the majority of the SAE features remained "uninterpretable", indicating room for improvement both in automated interpretability (we focused primarily on textual features!), but perhaps also questioning the SAE training and modelling assumptions. More work to be done here
18.07.2025 09:40
... and in some cases we were able to steer MAIRA-2's generations, selectively introducing or removing concepts from its generated report.
But steering worked inconsistently! Sometimes it did nothing, or introduced off-target effects. We still don't fully understand when it will work.
We found interpretable and radiology-relevant concepts in MAIRA-2, like:
- "Aortic tortuosity or calcification"
- "Placement and position of PICC lines"
- "Presence of 'shortness of breath' in indication"
- "Describing findings without comparison to prior images"
- "Use of 'possible' or 'possibly'"
We performed the full pipeline of SAE training, automated interpretation with LLMs, steering, and automated steering evaluation.
18.07.2025 09:32
New work from my team! arxiv.org/abs/2507.12950
Intersecting mechanistic interpretability and health AI
We trained and interpreted sparse autoencoders on MAIRA-2, our radiology MLLM. We found a range of human-interpretable radiology reporting concepts, but also many uninterpretable SAE features.
Mexico is an *official* NeurIPS event, it's an additional location for the conference and is different to the endorsement of EurIPS.
17.07.2025 19:32
It's an endorsed event but is not actually officially NeurIPS! Maybe if this experiment works well there will be more distributed (official) NeurIPS locations in future.
17.07.2025 14:26
We're excited to announce a second physical location for NeurIPS 2025, in Mexico City, which we hope will address concerns around skyrocketing attendance and difficulties in travel visas that some attendees have experienced in previous years.
Read more in our blog:
blog.neurips.cc/2025/07/16/n...
During the last couple of years, we have read a lot of papers on explainability and often felt that something was fundamentally missing.
This led us to write a position paper (accepted at #ICML2025) that attempts to identify the problem and to propose a solution.
arxiv.org/abs/2402.02870