Andrew Lampinen

@lampinen.bsky.social

Interested in cognition and artificial intelligence. Research Scientist at Google DeepMind. Previously cognitive science at Stanford. Posts are mine. lampinen.github.io

7,839 Followers 712 Following 315 Posts Joined Aug 2023
2 days ago
Title section of the paper: “Cross-Modal Taxonomic Generalization in (Vision) Language Models” by Tianyang Xu, Marcelo Sandoval-Castañeda, Karen Livescu, Greg Shakhnarovich, Kanishka Misra.

What is the interplay between representations learned from (language) surface forms alone, and those learned from more grounded evidence (e.g., vision)?

Excited to share new work understanding “Cross-modal taxonomic generalization” in (V)LMs

arxiv.org/abs/2603.07474

1/

32 12 1 0
5 days ago
Preview
The no-magic approach to understanding intelligent systems Today I want to write a bit about the philosophy I think underlies much of the work that my collaborators and I (as well as many other researchers that I respect) have done on understanding artificial...

Short post on what I call the "no-magic approach to understanding intelligent systems" — the philosophy I think of as motivating our work on understanding intelligence without resorting to magical thinking about AI or humans!
infinitefaculty.substack.com/p/the-no-mag...

32 5 1 1
6 days ago
Post image

Can large language models *introspect*?

In a new paper, @kmahowald.bsky.social and I study the MECHANISM of introspection in big open-source models.

tldr: Models detect internal anomalies through DIRECT ACCESS, but don't know what the anomalies are.

And they love to guess “apple” 🍎

70 16 2 6
1 week ago

Thank you! :)

1 0 1 0
1 week ago

But I'll forever be grateful for the privilege of being a part of DM through such an exciting time, for getting to work on many amazing projects, and for the wonderful collaborators and dear friends I've made along the way.

12 0 1 0
1 week ago

With all these changes, I've started to wonder if I could do the work that I think is most important and exciting more effectively somewhere else. After a short break, I'm excited to try something new (more to come soon, I hope).

16 0 1 0
1 week ago
View of London from a rooftop in King's Cross

After 5.5 years (or 7 or 9, counting internships), today was my last day at Google/DeepMind. When I was in London recently, I walked through the two floors that were (most of) DeepMind when I first joined, and thought about how much the company and field have changed since then.

67 0 2 0
2 weeks ago
Post image

🚨New preprint! In-context learning underlies LLMs’ real-world utility, but what are its limits? Can LLMs learn completely novel representations in-context and flexibly deploy them to solve tasks? In other words, can LLMs construct an in-context world model? Let’s see! 👀

37 5 1 1
2 weeks ago

Really cool work — learning over sequential experiences that contain the embodied cue of viewpoint as well as visual inputs can give rise to human-like 3D shape perception!

11 1 1 0
2 weeks ago
Preview
Dileep George joins Astera to lead its neuro-inspired AGI effort Dileep George is joining Astera as Head of AI, leading our AGI research division. Working alongside our Chief Scientist Doris Tsao, he and the team will explore novel, brain-inspired computational arc...

News! I've joined the Astera Institute to lead its neuroscience-based AGI research. Backed by a $1B+ commitment over the coming decade, my team will explore novel, brain-inspired architectures and algos toward safe, efficient, human-like AGI, working alongside Doris Tsao. 1/

astera.org/dileep-georg...

83 6 13 2
2 weeks ago

That kind of structural generalization to entirely new situations seems hard to obtain from simpler models (without building in the abstraction a priori), even if it *is* ultimately a consequence of the same sort of simplicity-bias processes at play in the cases above. 3/3

1 0 0 0
2 weeks ago
Passive learning of active causal strategies in agents and language models

the type of generalization you can observe. E.g. if you train an LM-type (passive-learning) agent on causal tasks, you get emergent generalization to infer and exploit novel causal dependencies never seen in training (proceedings.neurips.cc/paper_files/...). 2/3

1 0 1 0
2 weeks ago
Passive learning of active causal strategies in agents and language models

Right, there are several things to distinguish here. Completely agree that benign-overfitting and simplicity-bias phenomena are not unique to DL models. But I think the fact that DL models can represent a much broader (and more abstract) solution class than simpler models qualitatively changes 1/3

1 0 1 0
2 weeks ago

perhaps it would depend on what exactly the generalization kernel is...

2 0 1 0
2 weeks ago

Yeah that's a great connection! Although I think that DL models transition smoothly to higher-order structural-type generalization (e.g., learning a truly novel task in context) that doesn't seem as obviously capturable through exemplar-based models as I think of them, though 1/2

3 0 1 0
2 weeks ago

Awesome, will check it out, thanks!

1 0 0 0
2 weeks ago

Yes, I'd agree with that statement! (And glad to hear it :) )

3 0 1 0
2 weeks ago
Preview
Latent learning: episodic memory complements parametric learning by enabling flexible reuse of experiences When do machine learning systems fail to generalize, and what mechanisms could improve their generalization? Here, we draw inspiration from cognitive science to argue that one weakness of parametric m...

show how actively augmenting data can improve certain kinds of generalization, and at the end of arxiv.org/abs/2509.16189 we suggest it might have something to do with how offline replay helps natural intelligences — more to come on that soon, I hope :)

4 0 1 0
2 weeks ago
Preview
On the generalization of language models from in-context learning and finetuning: a controlled study Large language models exhibit exciting capabilities, yet can show surprisingly narrow generalization from finetuning. E.g. they can fail to generalize to simple reversals of relations they are trained...

Nevertheless, it seems like these models are not as sample efficient in the small-sample novel-task regime as humans; e.g., learning a new game from scratch. One reason may be that natural intelligences do more with each experience, including inferring beyond it; in arxiv.org/abs/2505.00661 we 2/3

2 0 1 0
2 weeks ago
Preview
Scaling Laws for Neural Language Models We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, wit...

Hmm, well maybe we should discuss offline sometime :)
My quick answer would be that if anything, scale seems to *improve* generalization at the same time as reducing interference (e.g., see Fig. 2 in arxiv.org/abs/2001.08361 — larger models reduce test loss faster from the same amount of data). 1/2
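A minimal sketch of that point, using the joint scaling-law fit from the linked paper (the constants below are the fits Kaplan et al. report; the code itself is my illustration, not from the paper or the thread):

# Joint scaling law from Kaplan et al. (2020):
#   L(N, D) = [(N_c / N)^(alpha_N / alpha_D) + D_c / D]^alpha_D
# The constants are the fits reported in the paper; the rest is illustrative.
ALPHA_N = 0.076   # model-size exponent
ALPHA_D = 0.095   # data-size exponent
N_C = 8.8e13      # fitted constant (non-embedding parameters)
D_C = 5.4e13      # fitted constant (tokens)

def predicted_loss(n_params, n_tokens):
    """Predicted test loss (nats/token) for model size N trained on D tokens."""
    return ((N_C / n_params) ** (ALPHA_N / ALPHA_D) + D_C / n_tokens) ** ALPHA_D

# Same data budget, two model sizes: the larger model reaches lower
# predicted test loss, i.e. scale improves generalization from fixed data.
for n_params in (1e8, 1e10):
    print(f"N={n_params:.0e}, D=1e10 tokens: L = {predicted_loss(n_params, 1e10):.2f}")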

2 0 1 0
2 weeks ago

How do you get it to build cumulatively on what it's learned, so that new learning can build on older learning? I think ARC-AGI-3 tasks, for example, capture these kinds of challenges very well. I'm not saying interference is totally solved, TBC — just that these other challenges are less solved 2/2

2 0 1 0
2 weeks ago

I'd say it's more a concern about learning efficiency, positive transfer (rather than negative, i.e. interference), and integration. How do you get a system to learn something efficiently and consolidate that in a way that makes necessary connections to prior knowledge? 1/2

4 0 1 0
2 weeks ago

Hmmm the link doesn't seem to work, what's the title?

0 0 1 0
2 weeks ago

Good point! I hadn't made that connection before

0 0 1 0
2 weeks ago

Hmm well I think images have quite a bit higher information density. Though you still definitely see memorization in diffusion models. But it's probably important that their output objective is reproduction, whereas humans are using visual inputs at a higher level of abstraction usually...

4 0 1 0
2 weeks ago
The Pitfalls of Simplicity Bias in Neural Networks

lags humans; lots more work to be done. Indeed, the simplicity biases I discuss here can sometimes be counterproductive, see e.g.: proceedings.neurips.cc/paper/2020/h...
But my main point here is that even for present systems, memorization doesn't necessarily prevent generalization. 2/2

2 0 1 0
2 weeks ago
Adversarially trained neural representations may already be as robust as corresponding biological neural representations Visual systems of primates are the gold standard of robust perception. There is thus a general belief that mimicking the neural representations that underlie those systems will yield artificial vis...

Definitely, though it's worth noting that natural systems don't necessarily always generalize perfectly either, e.g. proceedings.mlr.press/v162/guo22d.... — it's just much harder to optimize through them to directly find small attacks. But certainly there are ways that generalization of AI 1/2

2 1 1 0
2 weeks ago
Preview
Memory-based Parameter Adaptation Deep neural networks have excelled on a wide range of problems, from vision to language and game playing. Neural networks very gradually incorporate information into weights as they process data, requ...

2) not considering some fast recovery based on memory (arxiv.org/abs/1802.10542), which is almost certainly part of the natural intelligence solution to the problem. Not to say CI/CL are completely solved, but I think the actual problems are quite different than we used to imagine they were. 3/3
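For a concrete sense of what memory-based fast recovery can look like, here is a rough sketch in the spirit of the linked paper (MbPA): keep an episodic memory of (embedding, target) pairs, and locally adapt the output weights on a query's nearest neighbours at test time. All names and hyperparameters below are mine and purely illustrative, not the paper's actual method details.

import numpy as np

class EpisodicMemory:
    """Episodic store of (embedding, target) pairs with k-NN lookup."""
    def __init__(self):
        self.keys, self.values = [], []

    def write(self, embedding, target):
        self.keys.append(embedding)
        self.values.append(target)

    def lookup(self, query, k=5):
        keys = np.stack(self.keys)
        values = np.stack(self.values)
        nearest = np.argsort(np.linalg.norm(keys - query, axis=1))[:k]
        return keys[nearest], values[nearest]

def adapt_output_weights(W, query, memory, lr=0.1, steps=3):
    """Return a locally adapted copy of a linear head W, fine-tuned for a
    few steps on the query's retrieved neighbours (half-MSE loss)."""
    ks, vs = memory.lookup(query)
    W = W.copy()
    for _ in range(steps):
        grad = ks.T @ (ks @ W - vs) / len(ks)  # gradient of half-MSE wrt W
        W = W - lr * grad
    return W

The design point: the slow parametric weights can stay as they are, while the memory supplies a cheap, local correction exactly where the model is queried, which is one way a system could recover quickly after interference.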

5 0 1 0
2 weeks ago
Effect of scale on catastrophic forgetting in neural networks Catastrophic forgetting presents a challenge in developing deep learning models capable of continual learning, i.e. learning tasks sequentially. Recently, both computer vision and natural-language...

I should write a post on this at some point too, but I actually think the catastrophic forgetting/continual learning area was led astray for a long time by 1) neglecting how much model scale reduces catastrophic interference (e.g. openreview.net/forum?id=GhV...) and 2/3

4 0 1 0
2 weeks ago

Hmmm maybe for humans, but for current AI it actually doesn't seem that you need to work all that hard to preserve memory for earlier information. I mean, it does definitely degrade to some extent, but the fact that there are even memorization concerns is precisely because a lot is preserved. 1/3

4 0 1 0