
Jesse Geerts

@jessegeerts.bsky.social

Cognitive neuroscientist and AI researcher

184 Followers  |  138 Following  |  40 Posts  |  Joined: 06.06.2025

Posts by Jesse Geerts (@jessegeerts.bsky.social)

🚨🚨New Preprint Alert!🚨🚨

www.biorxiv.org/content/10.6...

Animal learning is painfully slow (at least initially). Yet well-trained animals can learn very fast, sometimes displaying few-shot inference. How does this transition occur?

21.02.2026 17:51 — 👍 58    🔁 21    💬 1    📌 1

Thrilled to finally share this work! 🧠🔊

Using a new reinforcement-free task, we show that mice (like humans) extract abstract structure from sound (unsupervised) & that dCA1 is causally required, building factorised, orthogonal subspaces of abstract rules.

Led by Dammy Onih!
www.biorxiv.org/content/10.6...

16.02.2026 13:01 — 👍 150    🔁 52    💬 3    📌 2

Code for our multi-region motor learning model is now available on GitHub!

github.com/jessegeerts/...

09.02.2026 09:13 — 👍 4    🔁 0    💬 0    📌 0
GitHub - jessegeerts/action-embedding: Action embeddings for RL - model of motor adaptation and generalization

Code to run this model and reproduce figures is now public: github.com/jessegeerts/...

09.02.2026 09:09 — 👍 0    🔁 0    💬 0    📌 0

Updated work from @jessegeerts.bsky.social extending his results on transitive inference in transformers (including LLMs!)

updated paper: arxiv.org/abs/2506.04289
bleeprint (what are we calling these?) below ⬇️

04.02.2026 18:34 — 👍 17    🔁 3    💬 0    📌 0
Relational reasoning and inductive bias in transformers and large language models Transformer-based models have demonstrated remarkable reasoning abilities, but the mechanisms underlying relational reasoning remain poorly understood. We investigate how transformers perform...

Updated paper: arxiv.org/abs/2506.04289. Joint work with @ndrewliu.bsky.social, @scychan.bsky.social, @clopathlab.bsky.social, and @neurokim.bsky.social

04.02.2026 15:10 — 👍 1    🔁 0    💬 0    📌 0

This parallels our small transformer findings: when models must reason from context, representational geometry determines success or failure at transitive inference.

04.02.2026 15:10 — 👍 1    🔁 0    💬 1    📌 0
Post image

This effect was strongest when models couldn't fall back on stored knowledge (incongruent/permuted items). For congruent items where weight-stored knowledge helps, the geometric scaffold barely mattered.

04.02.2026 15:10 — 👍 1    🔁 0    💬 1    📌 0
Post image

Across Gemini, Gemma, and GPT models, the linear scaffold consistently led to higher accuracy on transitive inference prompts.

04.02.2026 15:08 — 👍 1    🔁 0    💬 1    📌 0

We then prompted LLMs with different geometric scaffolds: "imagine these items on a number line" (linear) vs "on a circle" (circular). Circular orderings violate transitivity because relationships can wrap around (A>B>C>A).

04.02.2026 15:07 — 👍 1    🔁 0    💬 1    📌 0
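The scaffold manipulation can be sketched as simple prompt construction. The wording and the A/B/C items below are illustrative assumptions, not the paper's exact prompts:

```python
# Hypothetical sketch of the scaffolded prompting described above.
# SCAFFOLDS wording and the premise items are made up for illustration.

SCAFFOLDS = {
    "linear": "Imagine these items placed on a number line.",
    "circular": "Imagine these items placed around a circle.",
}

def build_prompt(scaffold, premises, query):
    """Prepend a geometric scaffold to the premises and the query."""
    return "\n".join([SCAFFOLDS[scaffold], *premises, query])

premises = ["A is greater than B.", "B is greater than C."]
query = "Is A greater than C? Answer yes or no."

linear_prompt = build_prompt("linear", premises, query)
circular_prompt = build_prompt("circular", premises, query)
```

The two prompts are identical except for the scaffold sentence, so any accuracy gap can be attributed to the imagined geometry.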
Post image

We used the ReCogLab dataset (github.com/google-deepm...) to test transitive inference with items that are congruent with world knowledge (whale > dolphin > goldfish), incongruent (goldfish > dolphin > whale), or random. This lets us tease apart reasoning from context vs relying on stored knowledge.

04.02.2026 15:07 — 👍 1    🔁 0    💬 1    📌 0
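To make the three conditions concrete, here is a toy construction in the spirit of that design. The premise wording and the use of adjacent pairs are assumptions, not the ReCogLab format:

```python
import random

# Illustrative construction of congruent / incongruent / random orderings.
# Item lists and premise wording are assumptions for this sketch.

CONGRUENT = ["whale", "dolphin", "goldfish"]  # matches world knowledge on size

def premises_from_order(order):
    """Adjacent-pair premises (A > B, B > C) from a total order."""
    return [f"{a} > {b}" for a, b in zip(order, order[1:])]

congruent = premises_from_order(CONGRUENT)
incongruent = premises_from_order(list(reversed(CONGRUENT)))

rng = random.Random(0)
shuffled = CONGRUENT[:]
rng.shuffle(shuffled)
random_order = premises_from_order(shuffled)
```

Only the incongruent and random conditions force the model to reason from the in-context premises rather than from stored knowledge.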

Quick recap: how a transformer is pre-trained determines whether it can do transitive inference (A>B, B>C β†’ A>C).

In-weights learning β†’ yes.
ICL trained on copying β†’ no.
ICL pre-trained on linear regression β†’ yes.

But these are small-scale toy models. What about in LLMs?

04.02.2026 15:05 — 👍 1    🔁 0    💬 1    📌 0
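For reference, the task's ground truth (not the model's mechanism) is just the transitive closure of the premise pairs: every non-adjacent relation a model is asked about follows from chaining adjacent ones.

```python
# Minimal reference for what transitive inference asks of a model:
# derive all implied relations from adjacent premises.

def transitive_closure(pairs):
    """All (a, b) such that a > b follows from the premise pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

premises = {("A", "B"), ("B", "C"), ("C", "D")}
inferred = transitive_closure(premises)
# The non-adjacent pairs (A, C), (B, D), (A, D) are the held-out inferences
```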

Update on this work! We've extended our transitive inference study to large language models 🧡

04.02.2026 15:04 — 👍 10    🔁 1    💬 1    📌 1
Post image

I'm excited to share my first PhD preprint!🎉
We studied how interactions between medial entorhinal cortex (MEC) and hippocampus shape theta sequences during navigation, and asked whether some "planning-like" patterns in hippocampus could arise from upstream MEC dynamics. (1/8)

16.01.2026 20:22 — 👍 29    🔁 8    💬 1    📌 0

With some trepidation, I'm putting this out into the world:
gershmanlab.com/textbook.html
It's a textbook called Computational Foundations of Cognitive Neuroscience, which I wrote for my class.

My hope is that this will be a living document, continuously improved as I get feedback.

09.01.2026 01:27 — 👍 584    🔁 237    💬 16    📌 10

Just to add one thing to this discussion: in our paper, the "supervised" network predicts the action, which is internally generated by the actor; that's why we assume the agent has access to it. We toyed with calling this self-supervised but didn't want to cause confusion with other self-supervised work.

08.01.2026 11:03 — 👍 2    🔁 0    💬 0    📌 0

Thanks for sharing that paper! I was unaware of this but it's a cool result

07.01.2026 16:34 — 👍 2    🔁 0    💬 0    📌 0

New paper led by wonderful postdocs Francesca Greenstreet and @jessegeerts.bsky.social and @clopathlab.bsky.social, trying to understand why (in the "what for" sense) there are multiple motor learning systems (supervised and RL-based) in the brain.

Check out Jesse's 🧡

www.biorxiv.org/content/10.6...

06.01.2026 13:43 — 👍 32    🔁 7    💬 1    📌 0

Check out our new work on motor learning across multiple brain regions!

05.01.2026 17:00 — 👍 20    🔁 6    💬 0    📌 0

Thank you! Feel free to get in touch with comments or questions

05.01.2026 15:04 — 👍 1    🔁 0    💬 0    📌 0

Many thanks to first author Francesca Greenstreet (equal contribution), and to @juangallego.bsky.social and @clopathlab.bsky.social!

05.01.2026 13:05 — 👍 3    🔁 0    💬 1    📌 0
Why motor learning involves multiple systems: an algorithmic perspective The initial stage of learning motor skills involves exploring vast action spaces, making it impractical to learn the value of every possible action independently. This poses a challenge for standard r...

The key insight: supervised learning in ctx/cerebellum doesn't just predict actions - it builds a structured space that makes RL in basal ganglia faster and enables generalization between similar movements. We make several predictions in the paper: www.biorxiv.org/content/10.6...

(8/9)

05.01.2026 13:04 — 👍 4    🔁 0    💬 1    📌 0
Post image

3. Limits on dual adaptation (shown by Woolley et al 2007) also emerge: learning opposite rotations for nearby targets fails because their policies overlap in embedding space. Distant targets adapt independently. (7/9)

05.01.2026 13:02 — 👍 3    🔁 0    💬 1    📌 0
Post image

2. The model also captures classic behavioural findings such as fast visuomotor adaptation (e.g. Krakauer et al. 2000). In our model, this emerges from retraining only the linear decoder. The characteristic generalization profile falls out without additional assumptions. (6/9)

05.01.2026 13:02 — 👍 3    🔁 0    💬 1    📌 0
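A toy version of that decoder-retraining account. The fixed 2-D cosine embedding of reach directions and the use of least squares (in place of gradient retraining) are illustrative assumptions:

```python
import numpy as np

# Sketch of adaptation by refitting only the linear decoder, as in the
# visuomotor adaptation result described above. Details are assumptions.

angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
Z = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # fixed embedding, one row per target

baseline_targets = Z.copy()                  # pre-rotation: reproduce each direction
W, *_ = np.linalg.lstsq(Z, baseline_targets, rcond=None)

rot = np.deg2rad(30)                         # 30-degree visuomotor rotation
R = np.array([[np.cos(rot), -np.sin(rot)],
              [np.sin(rot),  np.cos(rot)]])
rotated_targets = baseline_targets @ R.T
W_adapt, *_ = np.linalg.lstsq(Z, rotated_targets, rcond=None)  # refit decoder only
```

Because all directions share one decoder, refitting it counteracts the rotation across the whole workspace rather than target by target, which is where the generalization profile comes from in this picture.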
Post image

1. Recent work shows similar striatal activity for similar reaches (Park et al. 2025), while classic work shows distinct activity for distinct choices. Our model captures both: if basal ganglia learn policies in a structured embedding, policy similarity scales with action similarity. (5/9)

05.01.2026 13:00 — 👍 3    🔁 0    💬 1    📌 0
Post image

Our model combines two learning systems: a supervised encoder-decoder (ctx/cerebellum) learns embeddings where similar movements cluster together, by predicting actions via a bottleneck. An actor-critic network (basal ganglia) learns policies directly in this low-dimensional embedding space. (4/9)

05.01.2026 12:59 — 👍 4    🔁 0    💬 1    📌 0
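A structural sketch (untrained, numpy) of that two-system layout: an encoder-decoder with a low-dimensional bottleneck stands in for cortex/cerebellum, and an actor that outputs points in the bottleneck space stands in for basal ganglia. All dimensions and the tanh nonlinearity are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
action_dim, embed_dim, state_dim = 50, 8, 10

W_enc = 0.1 * rng.normal(size=(embed_dim, action_dim))   # action -> embedding (encoder)
W_dec = 0.1 * rng.normal(size=(action_dim, embed_dim))   # embedding -> action (decoder)
W_actor = 0.1 * rng.normal(size=(embed_dim, state_dim))  # state -> embedding (actor)

def embed(action):
    """Supervised pathway: compress an action through the bottleneck."""
    return np.tanh(W_enc @ action)

def act(state):
    # The actor never touches raw actions: it proposes a point in
    # embedding space, which the shared decoder turns into a motor command.
    z = np.tanh(W_actor @ state)
    return W_dec @ z

motor_command = act(rng.normal(size=state_dim))
```

The design choice is that RL only ever operates in the 8-dimensional embedding, so exploration and credit assignment happen in a space where similar movements are already neighbours.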
Post image

We were inspired by recent machine learning approaches which learn structured action representations for tasks like robotics, where action spaces are vast 🤖 (3/9)

05.01.2026 12:57 — 👍 4    🔁 0    💬 1    📌 0

The problem: learning motor skills means exploring vast action spaces (think: every muscle combination). Standard RL models treat each action independently, which is slow and scales badly with the size of the action space (2/9)

05.01.2026 12:57 — 👍 3    🔁 0    💬 2    📌 0
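A back-of-the-envelope version of that scaling argument. The muscle count and activation levels are made-up numbers, not figures from the paper:

```python
# Tabular RL keeps one value estimate per discrete action, so the table
# grows exponentially with the number of effectors.

n_muscles = 20
levels_per_muscle = 3                       # e.g. relax / low / high activation
n_actions = levels_per_muscle ** n_muscles  # one independent entry per combination

embedding_dim = 8                           # a low-dimensional action embedding

print(n_actions)      # 3486784401 -- billions of independent entries
print(embedding_dim)  # vs. a policy over 8 shared dimensions
```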
Post image

🧠 New year, new preprint!

Why does motor learning involve multiple brain regions? We propose that the cortico-cerebellar system learns a "map" of actions where similar movements are nearby, while basal ganglia do RL in this simplified space.

www.biorxiv.org/content/10.6...

05.01.2026 12:54 — 👍 93    🔁 23    💬 4    📌 3

Thrilled to start 2026 as faculty in Psych & CS
@ualberta.bsky.social + Amii.ca Fellow! 🥳 Recruiting students to develop theories of cognition in natural & artificial systems 🤖💭🧠. Find me at #NeurIPS2025 workshops (speaking coginterp.github.io/neurips2025 & organising @dataonbrainmind.bsky.social)

06.12.2025 19:26 — 👍 103    🔁 27    💬 4    📌 1