What is the interplay between representations learned from (language) surface forms alone, and those learned from more grounded evidence (e.g., vision)?
Excited to share new work on understanding “Cross-modal taxonomic generalization” in (V)LMs
arxiv.org/abs/2603.07474
1/
Short post on what I call the "no-magic approach to understanding intelligent systems" — the philosophy I think of as motivating our work on understanding intelligence without resorting to magical thinking about AI or humans!
infinitefaculty.substack.com/p/the-no-mag...
Can large language models *introspect*?
In a new paper, @kmahowald.bsky.social and I study the MECHANISM of introspection in big open-source models.
tldr: Models detect internal anomalies through DIRECT ACCESS, but don't know what the anomalies are.
And they love to guess “apple” 🍎
Thank you! :)
But I'll forever be grateful for the privilege of being a part of DM through such an exciting time, for getting to work on many amazing projects, and for the wonderful collaborators and dear friends I've made along the way.
With all these changes, I've started to wonder whether I could do the work I think is most important and exciting more effectively somewhere else. After a short break, I'm excited to try something new (more to come soon, I hope).
After 5.5 years (or 7 or 9, counting internships), today was my last day at Google/DeepMind. When I was in London recently, I walked through the two floors that were (most of) DeepMind when I first joined, and thought about how much the company and field have changed since then.
🚨New preprint! In-context learning underlies LLMs’ real-world utility, but what are its limits? Can LLMs learn completely novel representations in-context and flexibly deploy them to solve tasks? In other words, can LLMs construct an in-context world model? Let’s see! 👀
Really cool work — learning over sequential experiences that contain the embodied cue of viewpoint as well as visual inputs can give rise to human-like 3D shape perception!
News! I've joined the Astera Institute to lead its neuroscience-based AGI research. Backed by a $1B+ commitment over the coming decade, my team will explore novel, brain-inspired architectures and algos toward safe, efficient human-like AGI, working alongside Doris Tsao. 1/
astera.org/dileep-georg...
That kind of structural generalization to entirely new situations seems hard to obtain from simpler models (without building the abstraction in a priori), even if it *is* ultimately a consequence of the same sort of simplicity-bias processes at play in the cases above. 3/3
the type of generalization you can observe. E.g. if you train an LM-type (passive-learning) agent on causal tasks, you get emergent generalization to infer and exploit novel causal dependencies never seen in training (proceedings.neurips.cc/paper_files/...). 2/3
Right, there are several things to distinguish here. Completely agree that benign-overfitting and simplicity-bias phenomena are not unique to DL models. But I think the fact that DL models can represent a much broader (and more abstract) solution class than simpler models qualitatively changes 1/3
perhaps it would depend on what exactly the generalization kernel is...
Yeah, that's a great connection! Although I think that DL models transition smoothly to higher-order structural-type generalization (e.g., learning a truly novel task in context) that doesn't seem as obviously capturable by exemplar-based models as I think of them, though 1/2
Awesome, will check it out, thanks!
Yes, I'd agree with that statement! (And glad to hear it :) )
show how actively augmenting data can improve certain kinds of generalization, and at the end of arxiv.org/abs/2509.16189 we suggest it might have something to do with how offline replay helps natural intelligences — more to come on that soon, I hope :)
Nevertheless, it seems like these models are not as sample-efficient in the small-sample novel-task regime as humans; e.g., learning a new game from scratch. One reason may be that natural intelligences do more with each experience, including inferring beyond it; in arxiv.org/abs/2505.00661 we 2/3
Hmm, well maybe we should discuss offline sometime :)
My quick answer would be that if anything, scale seems to *improve* generalization at the same time as reducing interference (e.g., see Fig. 2 in arxiv.org/abs/2001.08361 — larger models reduce test loss faster from the same amount of data). 1/2
How do you get it to build cumulatively on what it's learned, so that new learnings can build on older ones? I think ARC-AGI-3 tasks, for example, capture these kinds of challenges very well. I'm not saying interference is totally solved, TBC — just that I think these other challenges are less solved 2/2
I'd say it's more a concern about learning efficiency, positive transfer (rather than negative, i.e. interference), and integration. How do you get a system to learn something efficiently and consolidate that in a way that makes necessary connections to prior knowledge? 1/2
Hmmm the link doesn't seem to work, what's the title?
Good point! I hadn't made that connection before
Hmm, well I think images have quite a bit higher information density. Though you still definitely see memorization in diffusion models. But it's probably important that their output objective is reproduction, whereas humans are usually using visual inputs at a higher level of abstraction...
lags humans; lots more work to be done. Indeed, the simplicity biases I discuss here can sometimes be counterproductive, see e.g.: proceedings.neurips.cc/paper/2020/h...
But my main point here is that even for present systems, memorization doesn't necessarily prevent generalization. 2/2
Definitely, though it's worth noting that natural systems don't necessarily always generalize perfectly either, e.g. proceedings.mlr.press/v162/guo22d.... — it's just much harder to optimize through them to directly find small attacks. But certainly there are ways that generalization of AI 1/2
2) not considering some fast recovery based on memory (arxiv.org/abs/1802.10542), which is almost certainly part of the natural intelligence solution to the problem. Not to say CI/CL are completely solved, but I think the actual problems are quite different than we used to imagine they were. 3/3
I should write a post on this at some point too, but I actually think the catastrophic forgetting/continual learning area was led astray for a long time by 1) neglecting how much model scale reduces catastrophic interference (e.g. openreview.net/forum?id=GhV...) and 2/3
Hmmm maybe for humans, but for current AI it actually doesn't seem that you need to work all that hard to preserve memory for earlier information. I mean, it does definitely degrade to some extent, but the fact that there are even memorization concerns is precisely because a lot is preserved. 1/3