This paper on how the brain may do gradient descent is very cool: www.nature.com/articles/s41...
26.02.2026 03:02
@shahabbakht.bsky.social
|| assistant prof at University of Montreal || leading the systems neuroscience and AI lab (SNAIL: https://www.snailab.ca/) || associate academic member of Mila (Quebec AI Institute) || #NeuroAI || vision and learning in brains and machines
Our new paper is now out showing how time perception in animals is linked to their ecology. Using data from 237 species, we show that temporal perception is faster in species that fly and in pursuit predators: www.nature.com/articles/s41...
24.02.2026 13:22
This study is super cool (connecting ecology and perception). It suggests that some aspects of animals' perception (temporal precision) are shaped by their environment, which resonates with our proposal on internal-foraging perspectives on perceptual selection: www.sciencedirect.com/science/arti...
24.02.2026 13:51
This is a critical methodological point about the Platonic Representation Hypothesis paper.
I mistakenly thought the PRH paper used CKA as its main similarity metric.
Another motivation for thinking more deeply about metrics of similarity and alignment.
Though it's quite interesting that this subtle methodological detail turned out to matter so much for the main message.
25.02.2026 03:10
Thanks for the correction!
Yes, in the main text, your paper mainly relied on local similarity.
Actually, now I remember: my first reaction on reading your paper was to wonder why the CKA results weren't used in the main text.
What you describe sounds like the "lumpers vs. splitters" in this paper from @summerfieldlab.bsky.social lab: lumpers generalize more/retain less, and splitters generalize less/forget more. They gave a nice explanation based on rich vs lazy training regimes in ANNs.
www.nature.com/articles/s41...
Looking forward to reading the promised post on continual learning, @lampinen.bsky.social :)
24.02.2026 01:15
Cool! This is generative inference's prediction for human perception in this illusion: the squares are no longer squares!
Try it for yourself here:
huggingface.co/spaces/ttoos...
What is the relationship between memorization and generalization in AI? Is there a fundamental tradeoff? In infinitefaculty.substack.com/p/memorizati... I've reviewed some of the evolving perspectives on memorization & generalization in machine learning, from classic results through LLMs.
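One classic observation from that literature can be made concrete with a toy experiment (my own sketch, not from the linked post): a flexible enough model can always "memorize" its training set perfectly, even when the labels are pure noise, but only structured labels support generalization.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_nn_predict(X_train, y_train, X_query):
    # 1-nearest-neighbour prediction: copy the label of the closest train point.
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=-1)
    return y_train[np.argmin(d, axis=1)]

X = rng.normal(size=(200, 5))
y_true = (X[:, 0] > 0).astype(int)       # labels carry real structure
y_noise = rng.integers(0, 2, size=200)   # labels are pure noise

X_new = rng.normal(size=(500, 5))
y_new = (X_new[:, 0] > 0).astype(int)

# A 1-NN model fits any training set perfectly, even noise labels...
train_acc_noise = (one_nn_predict(X, y_noise, X) == y_noise).mean()
# ...but only structured labels transfer to new data.
test_acc_struct = (one_nn_predict(X, y_true, X_new) == y_new).mean()
test_acc_noise = (one_nn_predict(X, y_noise, X_new) == y_new).mean()
```

Here `train_acc_noise` is exactly 1.0 (perfect memorization), while `test_acc_noise` hovers near chance and `test_acc_struct` is well above it.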
18.02.2026 15:54
Another good reason for being cautious with representational similarity analysis: arxiv.org/abs/2602.14486
The famous Platonic Representation Hypothesis was largely driven by CKA's bias.
But the hypothesis still holds for shared local relationships.
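For readers who want to poke at the distinction, here is a minimal sketch (my own; `mutual_knn_alignment` is a generic mutual-k-NN variant standing in for the local metrics, and the exact formulations in the papers may differ) contrasting linear CKA with a local-neighbourhood similarity:

```python
import numpy as np

def linear_cka(X, Y):
    # Linear CKA between two representation matrices (samples x features),
    # following the standard formulation of Kornblith et al. (2019).
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

def mutual_knn_alignment(X, Y, k=5):
    # A "local" similarity: average overlap between each sample's k nearest
    # neighbours computed separately in the two representations.
    def knn(Z):
        d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return np.argsort(d, axis=1)[:, :k]
    nx, ny = knn(X), knn(Y)
    return float(np.mean([len(set(a) & set(b)) / k for a, b in zip(nx, ny)]))
```

Both metrics equal 1 for identical representations, but they weight global versus local geometry very differently, which is where the bias discussed above comes in.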
The revised version of our paper on the impact of top-down feedback is now out @elife.bsky.social:
doi.org/10.7554/eLif...
tl;dr: we show that using human-brain-like feedback/anatomy in a deep RNN leads to human-like visual biases!
This work was led by @tmshbr.bsky.social
#NeuroAI
Excited to launch Principia, a nonprofit research organisation at the intersection of deep learning theory and AI safety.
Our goal is to develop theory for modern machine learning systems that can help us understand complex network behaviors, including those critical for AI safety and alignment.
1
Thrilled to finally share this work!
Using a new reinforcement-free task, we show that mice (like humans) extract abstract structure from sound without supervision, and that dCA1 is causally required, building factorised, orthogonal subspaces of abstract rules.
Led by Dammy Onih!
www.biorxiv.org/content/10.6...
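One standard way to quantify whether two neural subspaces are "orthogonal" (a generic sketch, not the paper's actual analysis) is via the principal angles between the subspaces spanned by two sets of population activity patterns:

```python
import numpy as np

def principal_angles(A, B):
    # Principal angles (radians) between the column spaces of A and B.
    # Identical subspaces give all-zero angles; fully orthogonal subspaces
    # give angles of pi/2.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(s, 0.0, 1.0))

# Two 2-D subspaces of R^4 that share no directions:
A = np.eye(4)[:, :2]   # span{e1, e2}
B = np.eye(4)[:, 2:]   # span{e3, e4}
angles = principal_angles(A, B)   # both angles are pi/2
```

In practice, `A` and `B` would be (for example) the top principal components of population activity under two different abstract rules; angles near pi/2 indicate factorised, non-interfering representations.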
I don't think AI's success in coding will automatically translate to other fields. That level of performance only works where the output is as easily verifiable as code, and not many domains fit that bill. 2/2
11.02.2026 16:16
"The experience that tech workers have had over the past year, of watching AI go from 'helpful tool' to 'does my job better than I do', is the experience everyone else is about to have. Law, finance, medicine, accounting, …"
I'm not sure … 1/2
fortune.com/2026/02/11/s...
… especially whenever controversies around representational similarity resurface.
11.02.2026 15:06
You're comparing two fields at very different stages of theoretical maturity. Neuroscience (and NeuroAI) is still largely pre-theoretic. I often return to Hasok Chang's Inventing Temperature as a parallel for where we actually stand in theoretical neuroscience, …
11.02.2026 15:06
Definitely not enough.
10.02.2026 22:05
My bet is on the ecological relevance of training data and temporal prediction as the core objective.
Architecture is difficult to constrain, given that ANNs and brains rely on substantially different functional mechanisms.
That's just my view, though; I could be wrong.
Exactly my point. The emerging view seems to be that, assuming equal trainability (a big assumption, though), architecture may not play as big a role as the training objective and data.
10.02.2026 21:28
The problem is the training objective and lack of recurrence, not the 50-layer architecture.
10.02.2026 21:11
Also see @mschrimpf.bsky.social thread here: bsky.app/profile/msch...
10.02.2026 18:43
My two cents: as with any discipline, we use tools to probe phenomena scientifically while simultaneously striving to understand those tools better.
It's an iterative process, one that sees work like the paper above as part of the paradigm, not standing outside it.
I encourage everyone to read the paper itself rather than relying on social media impressions.
It's an excellent paper about a specific category of NeuroAI work, and its conclusions are still hard to generalize to the whole field.
You mean a framework that could build models of artificial stimuli would be sufficient?
10.02.2026 18:25
Great point.
I'd use classical stimuli for testing out-of-distribution generalization rather than for model development.
This would actually be the exact opposite of what was proposed in "In praise of artifice"
www.nature.com/articles/nn1...
Our paper is out in @natneuro.nature.com!
www.nature.com/articles/s41...
We develop a geometric theory of how neural populations support generalization across many tasks.
@zuckermanbrain.bsky.social
@flatironinstitute.org
@kempnerinstitute.bsky.social
1/14
I agree. My main point is that decompositionality (whether or not it supports modularity) is baked into classical stimuli a priori. These stimuli then act as an inductive bias in models developed to capture the resulting neural or behavioral responses.
10.02.2026 16:25
In other words, classic stimuli rely on a strong notion of modularity that may or may not hold for naturalistic stimuli, where visual features are inherently intermixed.
10.02.2026 16:12