[Image: What do representations tell us about a system? A mouse with a scope yielding a vector of activity patterns, alongside a neural network yielding a vector of unit activity patterns.]
[Image: Common analyses of neural representations. Encoding models: relating activity to task features, drawn as an arrow from an on/off task-event trace to a neuron and its spike train. Comparing models via neural predictivity: two neural networks compared by their R^2 against mouse brain activity. RSA: assessing brain-brain or model-brain correspondence using representational dissimilarity matrices.]
In neuroscience, we often try to understand systems by analyzing their representations, using tools like regression or RSA. But are these analyses biased towards discovering a subset of what a system represents? If you're interested in this question, check out our new commentary! Thread:
05.08.2025 14:36 · 163 likes · 53 reposts · 5 replies · 0 quotes
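To make the two analyses in the images above concrete, here is a minimal sketch (not from the commentary itself; it assumes NumPy, SciPy, and scikit-learn, with random placeholder arrays standing in for recorded and model activity) of an encoding-model fit and an RSA comparison:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr
from sklearn.linear_model import RidgeCV

# Hypothetical data: 20 stimulus conditions (all arrays are random
# placeholders standing in for real recordings / model activations).
rng = np.random.default_rng(0)
task_features = rng.normal(size=(20, 5))    # e.g. stimulus properties
brain_acts = rng.normal(size=(20, 100))     # recorded neurons
model_acts = rng.normal(size=(20, 256))     # one model layer

# Encoding model: linear map from task features to neural activity.
# Neural predictivity is usually summarized by held-out R^2; for brevity
# this reports the in-sample fit.
enc = RidgeCV().fit(task_features, brain_acts)
print("encoding-model R^2:", enc.score(task_features, brain_acts))

# RSA: build a representational dissimilarity matrix (RDM) per system,
# then correlate the RDMs' upper triangles.
def rdm(activity):
    """Pairwise (1 - Pearson r) between condition-evoked patterns."""
    return squareform(pdist(activity, metric="correlation"))

iu = np.triu_indices(len(task_features), k=1)
rho, _ = spearmanr(rdm(brain_acts)[iu], rdm(model_acts)[iu])
print("brain-model RSA (Spearman rho):", rho)
```

In practice the encoding model would be scored on held-out conditions and the RDMs built from real stimulus-evoked responses; the point is just that both analyses reduce a system to one particular summary of its activity, which is what the commentary's question about bias is aimed at.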
Great new paper by @jessegeerts.bsky.social, looking at a certain type of generalization in transformers -- transitive inference -- and what conditions induce it.
06.06.2025 15:50 · 1 like · 0 reposts · 0 replies · 0 quotes
New paper: Generalization from context often outperforms generalization from finetuning.
And you might get the best of both worlds by spending extra compute and training time to augment finetuning.
02.05.2025 17:48 · 4 likes · 0 reposts · 0 replies · 0 quotes
It was such a pleasure to co-supervise this research, but
@aaditya6284.bsky.social should really take the bulk of the credit :)
And thank you so much to all our wonderful collaborators, who made fundamental contributions as well!
Ted Moskovitz, Sara Dragutinovic, Felix Hill, @saxelab.bsky.social
11.03.2025 18:18 · 2 likes · 0 reposts · 0 replies · 0 quotes
This paper is dedicated to our collaborator
Felix Hill, who passed away recently. This is our last ever paper with him.
It was bittersweet to finish this research, which contains so much of the scientific spark that he shared with us. Rest in peace Felix, and thank you so much for everything.
11.03.2025 18:18 · 3 likes · 0 reposts · 1 reply · 0 quotes
Some general takeaways for interp:
11.03.2025 18:18 · 2 likes · 0 reposts · 1 reply · 0 quotes
4. We provide intuition for these dynamics through a simple mathematical model.
11.03.2025 18:18 · 2 likes · 0 reposts · 1 reply · 0 quotes
3. A lot of previous work (including our own) has emphasized *competition* between in-context and in-weights learning.
But we find that cIWL and ICL actually compete AND cooperate, via shared subcircuits. In fact, ICL cannot emerge if cIWL is blocked from emerging, even though ICL emerges first!
11.03.2025 18:18 · 2 likes · 0 reposts · 1 reply · 0 quotes
2. At the end of training, ICL doesn't give way to in-weights learning (IWL), as we previously thought. Instead, the model prefers a surprising strategy that is a *combination* of the two!
We call this combo "cIWL" (context-constrained in-weights learning).
11.03.2025 18:18 · 2 likes · 0 reposts · 1 reply · 0 quotes
1. We aimed to better understand the transience of in-context learning (ICL) -- where ICL can emerge but then disappear after long training times.
11.03.2025 18:18 · 2 likes · 0 reposts · 1 reply · 0 quotes
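In case it helps to make points 1-3 above concrete: in this literature, ICL and in-weights learning are usually disentangled by relabelling the context so that a context-following answer and a memorized answer disagree. Below is a rough, hypothetical probe in that spirit (the data, labels, and `predict` callable are all made up for illustration; see the paper for the actual experimental setup):

```python
def probe_strategy(predict, context, query, trained_label, context_label):
    """Classify which strategy produced the answer for one query.

    `context` holds exemplars in which the query's class has been
    relabelled to `context_label`, so a context-following (ICL) answer
    and a memorized (IWL) answer necessarily disagree.
    `predict` is any callable (context, query) -> label; a real
    transformer would sit behind it.
    """
    answer = predict(context, query)
    if answer == context_label:
        return "ICL"   # followed the (relabelled) context
    if answer == trained_label:
        return "IWL"   # followed the label memorized during training
    return "other"

# Toy usage with a stand-in predictor that just copies the first
# context exemplar's label (so it behaves like a pure ICL solution).
toy_predict = lambda context, query: context[0][1]
context = [("img_A", "label_2"), ("img_B", "label_3")]
print(probe_strategy(toy_predict, context, "img_A",
                     trained_label="label_1", context_label="label_2"))  # -> ICL
```

Tracking how often such a probe returns "ICL" over the course of training is one way to see the emergence-then-transience pattern described in point 1.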
Dropping a few high-level takeaways in this thread.
For more details please see Aaditya's thread,
or the paper itself.
bsky.app/profile/aadi...
arxiv.org/abs/2503.05631
11.03.2025 18:18 · 3 likes · 0 reposts · 1 reply · 0 quotes
New work led by
@aaditya6284.bsky.social
"Strategy coopetition explains the emergence and transience of in-context learning in transformers."
We find some surprising things!! E.g. that circuits can simultaneously compete AND cooperate ("coopetition") 🤯 🧵👇
11.03.2025 18:18 · 9 likes · 4 reposts · 1 reply · 0 quotes
The broader spectrum of in-context learning
The ability of language models to learn a task from a few examples in context has generated substantial interest. Here, we provide a perspective that situates this type of supervised few-shot learning...
What counts as in-context learning (ICL)? Typically, you might think of it as learning a task from a few examples. However, we've just written a perspective (arxiv.org/abs/2412.03782) suggesting that a much broader spectrum of behaviors can be interpreted as ICL! Quick summary thread: 1/7
10.12.2024 18:17 · 123 likes · 31 reposts · 2 replies · 1 quote
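As a toy illustration of that spectrum (my own made-up strings, not examples from the paper): a few-shot prompt demonstrates the task through examples, while an instruction prompt states it directly, yet in both cases the model's behavior on the query is shaped by the context rather than by a weight update:

```python
# Two hypothetical prompts illustrating points on the spectrum.

few_shot_prompt = (
    "eng: cat -> fr: chat\n"
    "eng: dog -> fr: chien\n"
    "eng: bird -> fr:"          # task inferred from the examples above
)

instruction_prompt = (
    "Translate the following English word into French.\n"
    "eng: bird -> fr:"          # no examples: the task is stated, not demonstrated
)

# On the broader view, both count as "in-context learning": the context,
# not a parameter update, determines what the model does with the query.
print(few_shot_prompt)
print(instruction_prompt)
```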
Introducing the :milkfoamo: emoji
09.12.2024 20:12 · 1 like · 1 repost · 0 replies · 0 quotes
Hahaha. We need a cappuccino emoji?!
09.12.2024 16:56 · 0 likes · 0 reposts · 1 reply · 0 quotes
I won't be at NeurIPS this week. Let's grab coffee if you want to fomo-commiserate with me
09.12.2024 02:16 · 12 likes · 0 reposts · 0 replies · 1 quote
Hello hello. Testing testing 123
09.12.2024 00:48 · 9 likes · 0 reposts · 3 replies · 0 quotes
AGI safety researcher at Google DeepMind, leading causalincentives.com
Personal website: tomeveritt.se
Cognitive neuroscientist and AI researcher
staff attorney @ Knight First Amendment Institute | he/him
Theoretical neuroscientist
Research fellow @ Kempner Institute, Harvard
dclark.io
Professor at the Gatsby Unit and Sainsbury Wellcome Centre, UCL, trying to figure out how we learn
Research director @Inria, Head of @flowersInria lab, prev. @MSFTResearch @SonyCSLParis
Artificial intelligence, cognitive sciences, sciences of curiosity, language, self-organization, autotelic agents, education, AI and society
http://www.pyoudeyer.com
Professor at Wharton, studying AI and its implications for education, entrepreneurship, and work. Author of Co-Intelligence.
Book: https://a.co/d/bC2kSj1
Substack: https://www.oneusefulthing.org/
Web: https://mgmt.wharton.upenn.edu/profile/emollick
Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...
Assistant Prof at University of Waterloo, CIFAR AI Chair at Vector Institute. Formerly UWNLP, Stanford NLP, MSR, FAIR, Google Brain, Salesforce Research via MetaMind
#machinelearning, #nlp
victorzhong.com
Philosophy of AI and Mind, but with a historical bent. Baruch College.
My dog is better than your dog.
https://www.jacob-browning.com/
Staff Research Scientist, People + AI Research (PAIR) team at Google DeepMind. Interpretability, analysis, and visualizations for LLMs. Opinions my own.
iftenney.github.io
Founder of Cognitive Resonance, a new venture dedicated to helping people understand human cognition and generative AI. Advocate for humans.
Personal Account
Founder: The Distributed AI Research Institute @dairinstitute.bsky.social.
Author: The View from Somewhere, a memoir & manifesto arguing for a technological future that serves our communities (to be published by One Signal / Atria
Asst Professor Psychology & Data Science @ NYU | Working on brains & climate, separately | Author of Models of the Mind: How physics, engineering, and mathematics have shaped our understanding of the brain https://shorturl.at/g23c5 | Personal account (duh)
PhD supervised by Tim Rocktäschel and Ed Grefenstette, part time at Cohere. Language and LLMs. Spent time at FAIR, Google, and NYU (with Brenden Lake). She/her.
PhD student at University of Alberta. Interested in reinforcement learning, imitation learning, machine learning theory, and robotics
https://chanb.github.io/
Princeton computer science prof. I write about the societal impact of AI, tech ethics, & social media platforms. https://www.cs.princeton.edu/~arvindn/
BOOK: AI Snake Oil. https://www.aisnakeoil.com/