New preprint out 🎉
What happens to the hippocampal “place code” when an animal is actively engaged in a task?
The answer surprised us (and might surprise you too!).
Let's dive in ⬇️
Link:
"Hippocampal trace coding dominates and disrupts place coding" www.biorxiv.org/content/10.6...
🧪🧠 New preprint: helping resolve a decades-long debate in synaptic plasticity
NMDA receptors are central to Hebbian learning. Yet for >30 years, the existence and function of presynaptic NMDA receptors have remained controversial.
📄 doi.org/10.64898/202...
1/6
Someone added a `.stop_gradient()` and left it running. 😜
New paper out at PNAS: www.pnas.org/doi/10.1073/...
Revisiting the high-dimensional geometry of population responses in the visual cortex with @jpillowtime.bsky.social. The review took forever because a reviewer doubted that our new estimator could infer eigenvalues beyond the rank of the data! (1/6)
Are you thinking about doing neuroscience outreach but want to make it more exciting or hands on?
Check out RetINaBox! (A collab led by the Trenholm lab)
We tried to bring the experience of experimental neuroscience to a classroom setting:
www.eneuro.org/content/13/1...
#neuroscience 🧪
Whoaaa!! This is a fantastic effort, and an amazing resource.
Huge congratulations to the authors! 🎉
Last day of poster sessions and presentations at
@neuripsconf.bsky.social. Full schedule featuring Mila-affiliated researchers presenting their work at #NeurIPS2025 here mila.quebec/en/news/foll...
In San Diego attending #NeurIPS2025?
Come to our poster to talk more about representation geometry in LLMs. 😃
🗓️ Friday 4:30-7:30 pm session
📍 Exhibit Hall C, D, E
🏁 Poster # 2502
(1/n) We are excited to share our new paper in Nature Communications, by Hagar Lavian (@hlavian.bsky.social) and team, revealing how the zebrafish brain integrates visual navigation signals! www.nature.com/articles/s41...
1/ Why does RL struggle with social dilemmas? How can we ensure that AI learns to cooperate rather than compete?
Introducing our new framework, MUPI (Embedded Universal Predictive Intelligence), which provides a theoretical basis for new cooperative solutions in RL.
Preprint🧵👇
(Paper link below.)
Population coding 🙌
How I contributed to rejecting one of my favorite papers of all time. Yes, I teach it to students daily, and refer to it in lots of papers. Sorry. open.substack.com/pub/kording/...
Thanks Ken! ☺️
Here's the (more updated) NeurIPS version: proceedings.neurips.cc/paper_files/...
Also, more recently we extended the use of power laws to characterize how representations change over (pre/post) training in LLMs. 🙂
🧵 here: bsky.app/profile/arna...
This is an excellent blueprint for a very fascinating use of an AI scientist! And the results are super cool and interesting! 🤩
I have been asked this when talking about our work on using power laws to study representation quality in deep neural networks; glad to have a more concrete answer now! 😃
Conrad Hal Waddington was born OTD in 1905.
His “epigenetic landscape” is a diagrammatic representation of the constraints influencing embryonic development.
On his 50th birthday, his colleagues gave him a pinball machine modeled on the epigenetic landscape.
🧪 🦫🦋 🌱🐋 #HistSTM #philsci #evobio
You mean the algorithms "generate" some auxiliary targets and then do supervised learning?
I got you 😉
I’m looking for interns to join our lab for a project on foundation models in neuroscience.
Funded by @ivado.bsky.social and in collaboration with the IVADO regroupement 1 (AI and Neuroscience: ivado.ca/en/regroupem...).
Interested? See the details in the comments. (1/3)
🧠🤖
A tad late (announcements coming) but very happy to share the latest developments in my previous preprint!
Previously, we showed that neural representations for control of movement are largely distinct following supervised or reinforcement learning. The latter most closely matches NHP recordings.
Thank you! 😁
Indeed! We show in the paper that the DPO objective is analogous to contrastive learning objectives used for self-supervised vision pretraining, which are indeed entropy-seeking in nature (as shown in prior work).
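The analogy is visible in the shape of the loss itself. Below is a minimal sketch of the standard DPO objective (a logistic loss on a reference-adjusted log-probability margin between a chosen and a rejected response); the function name and toy inputs are mine, not from the paper, but the formula is the published DPO loss, and its pairwise "push the positive above the negative" form is what contrastive objectives like InfoNCE share:

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO objective: -log(sigmoid(margin)), where the margin is
    the policy's reference-adjusted log-prob of the chosen response (w)
    minus that of the rejected response (l), scaled by beta."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))

# Widening the margin (chosen more likely, rejected less likely, relative
# to the reference model) drives the loss toward zero -- the same pairwise
# discrimination structure as a two-sample contrastive loss.
```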
I feel spectral metrics can go a long way in unlocking LLM understanding+design. 🚀
A big shoutout to @koustuvsinha.com for insightful discussions that shaped this work, and
@natolambert.bsky.social + the OLMo team!
Paper 📝: arxiv.org/abs/2509.23024
👩💻 Code: Coming soon! 👨💻
This work was done with dream team 🤩
@melodylizx.bsky.social @kumarkagrawal.bsky.social Komal Teru @glajoie.bsky.social @adamsantoro.bsky.social @tyrellturing.bsky.social
at @mila-quebec.bsky.social @berkeleyair.bsky.social @cohere.com & @googleresearch.bsky.social!
🧵9/9
Takeaway: LLM training exhibits multi-phasic information geometry changes! ✨
- Pretraining: Compress → Expand (Memorize) → Compress (Generalize).
- Post-training: SFT/DPO → Expand; RLVR → Consolidate.
Representation geometry offers insights into when models memorize vs. generalize! 🤓
🧵8/9
BONUS: Is task-relevant info contained in the top eigendirections?
On SciQ:
- Removing top 10/50 directions barely hurts accuracy.✅
- Retaining only top 10/50 directions CRUSHES accuracy.📉
As supported by our theoretical results, eigenspectrum tail encodes critical task information! 🤯
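For readers who want to try this kind of ablation themselves: here is a generic sketch (not the paper's code, and on synthetic features rather than SciQ activations) of how one can split representations into the top-k eigendirections versus their complement before probing accuracy on each part:

```python
import numpy as np

def ablate_top_directions(features: np.ndarray, k: int):
    """Split centered features into the component outside the top-k
    eigendirections ('remove top-k') and the component inside them
    ('retain top-k')."""
    X = features - features.mean(axis=0)
    # Right singular vectors = eigendirections of the feature covariance,
    # sorted by singular value (largest first).
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    top = Vt[:k]                  # (k, d) top-k principal directions
    retained = X @ top.T @ top    # projection onto the top-k subspace
    removed = X - retained        # projection onto its complement
    return removed, retained

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 32))   # stand-in for LLM activations
removed, retained = ablate_top_directions(X, k=10)
# A downstream probe trained on `removed` vs. `retained` then tests where
# the task-relevant information lives.
```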
🧵7/9
Why do these geometric phases arise?🤔
We show, both through theory and with simulations in a toy model, that these non-monotonic spectral changes occur due to gradient descent dynamics with cross-entropy loss under 2 conditions:
1. skewed token frequencies
2. representation bottlenecks
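Both conditions are easy to reproduce in a toy setting. The sketch below is my own minimal stand-in (not the paper's model): a linear embedding bottleneck feeding a softmax head, trained by hand-written gradient descent with cross-entropy on Zipf-skewed token draws, while tracking the effective rank of the embedding matrix over steps:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, n_steps, lr = 200, 16, 300, 0.5   # vocab size, bottleneck width

# Condition 1: skewed (Zipf-like) token frequencies.
freq = 1.0 / np.arange(1, V + 1)
freq /= freq.sum()

# Condition 2: representation bottleneck d << V in a linear-softmax model:
# one-hot token -> h = E[token] -> logits = h @ W (target = input token).
E = 0.1 * rng.standard_normal((V, d))
W = 0.1 * rng.standard_normal((d, V))

def effective_rank(H):
    """exp(entropy) of the normalized singular values of centered H."""
    s = np.linalg.svd(H - H.mean(axis=0), compute_uv=False)
    p = s / s.sum()
    p = p[p > 1e-12]
    return np.exp(-(p * np.log(p)).sum())

ranks = []
for step in range(n_steps):
    tokens = rng.choice(V, size=256, p=freq)        # skewed batch
    H = E[tokens]                                   # (256, d) representations
    logits = H @ W
    logits -= logits.max(axis=1, keepdims=True)     # stable softmax
    P = np.exp(logits); P /= P.sum(axis=1, keepdims=True)
    G = P.copy(); G[np.arange(256), tokens] -= 1.0  # dCE/dlogits
    G /= 256
    grad_W = H.T @ G
    grad_H = G @ W.T
    np.add.at(E, tokens, -lr * grad_H)              # scatter grads into rows
    W -= lr * grad_W
    ranks.append(effective_rank(E))
# `ranks` traces how the embedding geometry evolves during training.
```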
🧵6/9
Post-training also yields distinct geometric signatures:
- SFT & DPO exhibit entropy-seeking expansion, favoring instruction memorization but reducing OOD robustness.📈
- RLVR exhibits compression-seeking consolidation, learning reward-aligned behaviors at the cost of reduced exploration.📉
🧵5/9
How do these phases relate to LLM behavior?
- Entropy-seeking: Correlates with short-sequence memorization (♾️-gram alignment).
- Compression-seeking: Correlates with dramatic gains in long-context factual reasoning, e.g. TriviaQA.
Curious about ♾️-grams?
See: bsky.app/profile/liuj...
🧵4/9
LLMs have 3 pretraining phases:
Warmup: Rapid compression, collapsing representation to dominant directions.
Entropy-seeking: Manifold expansion, adding info in non-dominant directions.📈
Compression-seeking: Anisotropic consolidation, selectively packing more info in dominant directions.📉
🧵3/9
When investigating OLMo (@ai2.bsky.social) & Pythia (@eleutherai.bsky.social) model checkpoints, as expected, pretraining loss ⬇️ monotonically.
BUT
🎢The spectral metrics (RankMe, αReQ) change non-monotonically (with more pretraining)!
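Both metrics named above are spectrum summaries of the representation matrix: RankMe is an entropy-based effective rank, and αReQ is the power-law decay exponent of the covariance eigenspectrum. Here is a generic reimplementation sketch (standard definitions, not the authors' code; the toy data is mine):

```python
import numpy as np

def rankme(features: np.ndarray, eps: float = 1e-12) -> float:
    """RankMe: effective rank = exp(entropy) of normalized singular values."""
    s = np.linalg.svd(features - features.mean(axis=0), compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]
    return float(np.exp(-(p * np.log(p)).sum()))

def alpha_req(features: np.ndarray, eps: float = 1e-12) -> float:
    """alphaReQ: power-law exponent of eigenspectrum decay, lambda_i ~ i^-alpha,
    via a least-squares fit in log-log space."""
    cov = np.cov(features, rowvar=False)
    eig = np.sort(np.linalg.eigvalsh(cov))[::-1]
    eig = eig[eig > eps]
    ranks_ = np.arange(1, len(eig) + 1)
    slope, _ = np.polyfit(np.log(ranks_), np.log(eig), 1)
    return float(-slope)

# White noise has a flat spectrum (alpha near 0, RankMe near full dimension);
# scaling column i by i^-0.5 gives a ~1/i spectrum, so alpha near 1.
rng = np.random.default_rng(0)
white = rng.standard_normal((4000, 50))
decayed = white * (np.arange(1, 51) ** -0.5)
print(rankme(white), alpha_req(white))
print(rankme(decayed), alpha_req(decayed))
```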
Takeaway: We discover geometric phases of LLM learning!
🧵2/9