Announcing this year's conference on the Mathematics of Neuroscience & AI (Rome, 9–12 June). We've got a stellar line-up and venue, and invite everyone to join:
www.neuromonster.org
Job alert: Deep Learning Theory & AI Safety
Applications open for a postdoc fellow (@saxelab.bsky.social lab) to study artificial deep networks using techniques from applied maths & stat physics.
Deadline: 26 Mar 2026
In collaboration with @stefsm.bsky.social
www.ucl.ac.uk/life-science...
Excited to be co-organising a #cosyne2026 workshop with Alison Comrie on 'algorithms for learning from scratch'! With a great line-up of speakers, we'll be tackling the question of what processes enable naive biological & artificial agents to adapt to new situations. Info here: tinyurl.com/4u8enf7k
24.02.2026 18:33
We're now accepting applications for the 2026 School on Analytical Connectionism, dedicated this year to Language Acquisition.
Gothenburg, Sweden
August 17–28, 2026
Apply by April 17!
analytical-connectionism.net/school/2026/
Meet the experts joining us this summer!
Thrilled to finally share this work!
Using a new reinforcement-free task, we show that mice (like humans) extract abstract structure from sound without supervision, and that dCA1 is causally required, building factorised, orthogonal subspaces of abstract rules.
Led by Dammy Onih!
www.biorxiv.org/content/10.6...
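As a toy sketch (my own illustration, not the paper's analysis) of what a factorised code with orthogonal subspaces buys you: if two abstract variables live along orthogonal population axes, a linear decoder for one variable trained on a subset of conditions generalises to held-out conditions. All names and numbers below are invented for the demo.

```python
import numpy as np

# Toy "factorised, orthogonal" population code: two binary task
# variables are each represented along their own axis, and the axes
# are orthogonal, so a linear decoder for one variable trained on
# half the conditions generalises to the other half.

rng = np.random.default_rng(1)
n = 50                                   # number of neurons
axis_a, axis_b = rng.standard_normal((2, n))
axis_b -= axis_a * (axis_a @ axis_b) / (axis_a @ axis_a)   # orthogonalise

# Population responses for the 4 conditions (a, b in {-1, +1}).
conds = [(a, b) for a in (-1, 1) for b in (-1, 1)]
X = np.array([a * axis_a + b * axis_b for a, b in conds])

# Decode variable a: train only on conditions where b = -1 ...
train = [i for i, (a, b) in enumerate(conds) if b == -1]
test = [i for i, (a, b) in enumerate(conds) if b == +1]
w = np.linalg.lstsq(X[train], np.array([conds[i][0] for i in train]),
                    rcond=None)[0]

# ... and it generalises to the held-out b = +1 conditions.
preds = np.sign(X[test] @ w)
print(preds, [conds[i][0] for i in test])
```

Because the minimum-norm solution has no component along the orthogonal b-axis, the held-out conditions decode correctly; a non-orthogonal (entangled) code would not guarantee this.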
How to apply:
Salary: USD 80,000–100,000 (GBP 50,000–74,000), annualised
Initial contract: 6 months, w/ extension based on funding
Details: docs.google.com/document/d/1...
Application: forms.gle/xKukH74iX16p...
4
Weβre hiring postdocs/research scientists! Your interests can be anywhere on the spectrum from pure theory to empirically testing predictions relevant to AI safety.
Our theoretical work relies on dynamical systems and tools from statistical physics.
3
We avoid many unwanted outcomes in the physical world using our knowledge of physics, and basic deep learning theory should eventually enable the same for AI.
We focus on simple, analytically tractable 'model organisms' that capture essential learning dynamics and behaviours.
2
Excited to launch Principia, a nonprofit research organisation at the intersection of deep learning theory and AI safety.
Our goal is to develop theory for modern machine learning systems that can help us understand complex network behaviors, including those critical for AI safety and alignment.
1
Our paper is out in @natneuro.nature.com!
www.nature.com/articles/s41...
We develop a geometric theory of how neural populations support generalization across many tasks.
@zuckermanbrain.bsky.social
@flatironinstitute.org
@kempnerinstitute.bsky.social
1/14
A great question; I'm not sure. It's important to understand whether Muon shares similar inductive biases.
05.02.2026 16:08
I agree, there seem to be connections, but it's not fully clear to me why. SLT is a static theory, and yet Daniel Murfet and others have shown that the stages we see also correspond to SLT posteriors of increasing complexity.
05.02.2026 16:05
Upcoming online talk next Monday, 9 February, at the ELLIS Reading Group on Mathematics & Efficiency of Deep Learning!
Open to all. Info at
sites.google.com/view/efficie...
Equipped with this theory, we make new predictions about how network width, data distribution, and initialization affect learning dynamics. For example, increasing the number of attention heads in linear attention shortens the plateaus in learning.
03.02.2026 16:19
So when progressing from simple to complex, linear networks learn solutions of increasing rank, ReLU networks learn solutions with increasing numbers of kinks, convolutional networks with increasing numbers of kernels, and attention models with increasing numbers of heads.
03.02.2026 16:19
Here the notion of simplicity is the number of effective units in the architecture: hidden neurons, convolutional kernels, or attention heads.
03.02.2026 16:19
Finally, we demonstrate that gradient descent sometimes naturally evolves along the connecting paths between saddles, one after another, yielding saddle-to-saddle dynamics.
We identify two distinct mechanisms, depending on the architecture: timescale separation between directions, or between units.
We then show that saddles are connected by gradient descent paths (invariant manifolds).
Along these paths, a larger network behaves like a smaller one, retaining the same simplicity during a saddle-to-saddle transition.
We first show that saddle points are ubiquitous in the loss landscape: fixed points of smaller networks can be embedded as saddle points of larger networks, yielding a nested hierarchy of saddles.
These saddles exist in any network that contains a sum of repeated units.
We present a theoretical framework that explains dynamical simplicity bias arising from saddle-to-saddle learning dynamics across neural network architectures:
Fully-connected, convolutional, attention-based, and more.
yedizhang.github.io/simplicity
Why don't neural networks learn all at once, but instead progress from simple to complex solutions? And what does 'simple' even mean across different neural network architectures?
Sharing our new paper at @iclr_conf, led by Yedi Zhang with Peter Latham
arxiv.org/abs/2512.20607
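For intuition, here is a minimal sketch (my own toy example, not the paper's code) of the staged, saddle-to-saddle learning the thread describes: a two-layer linear network from small random initialisation learns the singular modes of a low-rank target one at a time, so the loss passes through plateaus of increasing solution rank. All dimensions and hyperparameters are invented for the demo.

```python
import numpy as np

# Two-layer linear network W2 @ W1 trained by plain gradient descent
# on the squared error to a rank-2 target map. With a tiny init, the
# mode with the larger singular value escapes its saddle first, then
# the smaller one: loss plateaus separate the stages.

rng = np.random.default_rng(0)
d, h = 4, 4                                   # input dim, hidden width
target = np.diag([4.0, 1.0, 0.0, 0.0])        # rank-2, separated singular values

W1 = 1e-4 * rng.standard_normal((h, d))
W2 = 1e-4 * rng.standard_normal((d, h))
lr, steps = 0.05, 4000
losses = []
for _ in range(steps):
    E = W2 @ W1 - target                      # error in the end-to-end map
    losses.append(0.5 * np.sum(E ** 2))
    gW2 = E @ W1.T                            # gradients of 0.5 * ||E||^2
    gW1 = W2.T @ E
    W1 -= lr * gW1
    W2 -= lr * gW2

print(f"initial loss {losses[0]:.3f}, final loss {losses[-1]:.6f}")
```

Plotting `losses` on a log scale shows the step-like drops; counting singular values of `W2 @ W1` above a threshold during training tracks the growing effective rank.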
Applications for 2026 entry to the Gatsby Bridging Programme (7-week maths summer school) will open on 19 Jan and close on 16 Feb. Designed for students who wish to pursue a postgrad research degree in theoretical neuroscience or foundational machine learning but whose degree programme lacks a strong maths focus. Applications from students in underrepresented groups in STEM strongly encouraged. A small number of bursaries available. Register for the information webinar on 23 Jan.
Applications open on 19 Jan for the 7-week #Mathematics #SummerSchool in London. You will develop the maths skills and intuition necessary to enter the #TheoreticalNeuroscience / #MachineLearning field.
Find out more & register for the information webinar: www.ucl.ac.uk/life-science...
Really thrilled that this paper led by @neurozz.bsky.social is now published in its final version in @elife.bsky.social!!
This is a memory-focused (as opposed to RL-focused) account of the detailed characteristics of forward and backward awake and sleep replay!
elifesciences.org/articles/99931
Our paper on the "Oneirogen hypothesis" is now up in its revised form on eLife!
This is the hypothesis that psychedelics induce a dream-like state, which we show via modelling could explain a variety of perceptual and learning effects from such drugs.
elifesciences.org/reviewed-pre...
By the way, if you're interested in working together on problems like this, I'm starting my lab at UCSF this summer. Get in touch if you're interested in doing a postdoc! More info here: wj2.github.io/postdoc_ad (7/7)
09.01.2026 19:06
New preprint. We show that in addition to reward prediction errors (RPEs), dorsal striatal dopamine signals encode sensory prediction errors (SPEs), the difference between sensory prior & observed stimulus. www.biorxiv.org/content/10.6...
05.01.2026 10:49
Sleep-dependent consolidation and replay that doesn't require the hippocampus?
Very beautiful work by Marcus Stephenson-Jones's lab on sleep-driven sequential skill consolidation in the striatum.
www.biorxiv.org/content/10.1...
Thrilled to start 2026 as faculty in Psych & CS
@ualberta.bsky.social + Amii.ca Fellow! Recruiting students to develop theories of cognition in natural & artificial systems. Find me at #NeurIPS2025 workshops (speaking: coginterp.github.io/neurips2025 & organising @dataonbrainmind.bsky.social)
Find us at NeurIPS, Thur 4:30 pm #2115! We know networks have to be both plastic and stable but we're used to thinking about computations, such as memory, as additional requirements. Instead, we find that almost all stable & plastic networks display simple memory abilities.
01.12.2025 21:57
1/6 New preprint. How does the cortex learn to represent things and how they move without reconstructing sensory stimuli? We developed a circuit-centric recurrent predictive learning (RPL) model based on JEPAs.
doi.org/10.1101/2025...
Led by @atenagm.bsky.social @mshalvagal.bsky.social