
Ann Huang

@annhuang42.bsky.social

Comp Neuro, ML, Dynamical Systems 🧠🤖 PhD student at Harvard & Kempner Institute. Prev at McGill, Mila, EPFL.

90 Followers  |  87 Following  |  17 Posts  |  Joined: 28.03.2025

Latest posts by annhuang42.bsky.social on Bluesky

Saw your work before, really cool!

25.11.2025 17:59 - 👍 1    🔁 0    💬 0    📌 0

oh excellent pointer! That indeed matches our intuition

24.11.2025 20:04 - 👍 1    🔁 0    💬 0    📌 0
Not all solutions are created equal: An analytical dissociation of... A foundational principle of connectionism is that perception, action, and cognition emerge from parallel computations among simple, interconnected units that generate and rely on neural...

And variance in the behavior is not necessarily coupled with that in the features; for example see this paper for a dissociation between the two openreview.net/forum?id=Yuc...

24.11.2025 19:56 - 👍 0    🔁 0    💬 0    📌 0

What we found was that more consistent features during training do not guarantee more similar OOD behavior. In fact, stronger feature learning can lead to more variable OOD behavior here, which we hypothesize is due to overfitting.

24.11.2025 19:47 - 👍 1    🔁 0    💬 1    📌 0

Just looked at your paper, we're basically motivated by the same question applied to different architectures! Will try to visit your poster too

24.11.2025 19:37 - 👍 1    🔁 0    💬 0    📌 0

yay thanks Dan!!

24.11.2025 19:32 - 👍 1    🔁 0    💬 0    📌 0

Thanks to my amazing collaborators and my PI! @satpreetsingh.bsky.social @flavioh.bsky.social @kanakarajanphd.bsky.social

🔹Paper: arxiv.org/pdf/2410.03972
🔹Poster: Fri Dec 5, Poster #2001 at Exhibition Hall C, D, E

Happy to chat at NeurIPS or by email at annhuang@g.harvard.edu!

24.11.2025 16:43 - 👍 8    🔁 0    💬 0    📌 0

Our results:
- support the contravariance principle (Cao & @dyamins.bsky.social)
- reveal when weight- & dynamics-level variability move together (or in opposite directions)
- give "knobs" for controlling degeneracy, whether you're studying shared mechanisms or individual variability in task-trained RNNs.

24.11.2025 16:43 - 👍 6    🔁 0    💬 2    📌 0

4๏ธโƒฃ Regularization (L1, low-rank)
Both types of structural regularization reduce degeneracy across all levels. Regularization nudges networks toward more consistent, shared solutions.

24.11.2025 16:43 - 👍 5    🔁 0    💬 1    📌 0
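[Editor's note] The two regularizers in the post can be sketched concretely. Below is a minimal, hypothetical illustration (names and values are not from the paper's code): an L1 penalty contributes a `lam * sign(W)` subgradient that pushes recurrent weights toward sparsity, while a low-rank scheme parametrizes the n×n recurrent matrix as a product of two thin factors.

```python
import numpy as np

def l1_subgradient(W, lam=1e-3):
    """Subgradient of lam * sum(|W_ij|); added to the task gradient,
    it shrinks small weights toward zero (sparser, more shared solutions)."""
    return lam * np.sign(W)

def low_rank_recurrent(n=64, r=4, seed=0):
    """Rank-r parametrization of an n x n recurrent matrix: W = U @ V.
    Training U and V instead of W constrains solutions to a low-rank family."""
    rng = np.random.default_rng(seed)
    U = rng.normal(0, 1 / np.sqrt(r), (n, r))
    V = rng.normal(0, 1 / np.sqrt(n), (r, n))
    return U @ V

W = low_rank_recurrent()
print(np.linalg.matrix_rank(W))  # rank is at most r = 4
```

Both constraints shrink the set of weight configurations that solve the task, which is one intuition for why they reduce degeneracy.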

3๏ธโƒฃ Network size
When we fix feature learning (using ยตP), larger RNNs converge to more consistent solutions at all levels โ€” weights, dynamics, and behavior.
A clean convergence-with-scale effect, demonstrated on RNNs across levels.

24.11.2025 16:43 - 👍 4    🔁 0    💬 1    📌 0

We then causally tested feature learning's effect on degeneracy using µP scaling. Stronger feature learning reduces dynamical degeneracy & increases weight degeneracy (like harder tasks).
It also increases behavioral degeneracy under OOD inputs (likely due to overfitting).

24.11.2025 16:43 - 👍 5    🔁 0    💬 1    📌 0

2๏ธโƒฃ Feature learning
Complex tasks push RNNs into feature learning, where the network has to adapt its internal weights and features to solve the task. Weights travel much farther from initialization, leading to more dispersed weights in the weight space (higher degeneracy).

24.11.2025 16:43 - 👍 6    🔁 0    💬 1    📌 0
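[Editor's note] "How far weights travel from initialization" is the standard proxy for lazy vs. feature learning, and an output-scale multiplier is one common dial between the two regimes. The toy below (a two-layer linear net with a centered, alpha-scaled output; a rough sketch in the spirit of this literature, not the paper's µP setup) measures relative weight travel ||W − W₀||_F / ||W₀||_F under a large vs. small output scale.

```python
import numpy as np

def weight_travel(alpha, steps=2000, lr=0.1, seed=0):
    """Train f(x) = alpha * (w2 @ W1 @ x - w2_0 @ W1_0 @ x) on a linear task.
    Large alpha -> lazy regime (weights barely move); small alpha -> feature
    learning (weights travel far). Toy illustration only.
    Returns relative travel ||W1 - W1_0||_F / ||W1_0||_F."""
    rng = np.random.default_rng(seed)
    n, d, N = 16, 4, 32
    W1 = rng.normal(0, 1 / np.sqrt(d), (n, d))
    w2 = rng.normal(0, 1 / np.sqrt(n), n)
    W1_0, w2_0 = W1.copy(), w2.copy()
    X = rng.normal(size=(N, d))
    y = X @ np.array([1.0, -1.0, 0.5, 0.0])          # linear teacher
    for _ in range(steps):
        f = alpha * (X @ W1.T @ w2 - X @ W1_0.T @ w2_0)  # centered output
        err = f - y
        gw2 = alpha * (X @ W1.T).T @ err / N         # dL/dw2, L = mean(err^2)/2
        gW1 = alpha * np.outer(w2, X.T @ err) / N    # dL/dW1
        W1 -= (lr / alpha**2) * gW1                  # lr rescaled so function-
        w2 -= (lr / alpha**2) * gw2                  # space speed matches
    return np.linalg.norm(W1 - W1_0) / np.linalg.norm(W1_0)

print(weight_travel(alpha=1.0), weight_travel(alpha=20.0))
# smaller alpha -> larger relative weight travel (more feature learning)
```

Both runs fit the task equally well; only the weight movement differs, which is exactly the dissociation between behavior and weights the thread describes.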

1๏ธโƒฃ Task complexity
As tasks get harder, we observe less degeneracy in dynamics/behavior, but more degeneracy in the weights.

When trained on harder tasks, RNNs converge to similar neural dynamics and OOD behavior, but their weight configurations diverge. Why?

24.11.2025 16:43 - 👍 6    🔁 0    💬 1    📌 0

Using 3,400 RNNs across 4 neuroscience-relevant tasks (flip-flop memory, working memory, pattern generation, path integration), we systematically varied:
- task complexity
- learning regime
- network size
- regularization

Our findings:

24.11.2025 16:43 - 👍 5    🔁 0    💬 1    📌 0
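[Editor's note] The sweep above is a full factorial design over the four factors. A hypothetical sketch of such a grid (the factor values here are illustrative placeholders, not the paper's actual settings):

```python
from itertools import product

# Hypothetical sweep grid; values are illustrative, not the paper's settings.
grid = {
    "task": ["flip_flop", "working_memory", "pattern_gen", "path_integration"],
    "feature_learning": ["lazy", "rich"],     # learning regime
    "hidden_size": [64, 256, 1024],           # network size
    "regularization": [None, "l1", "low_rank"],
}

# Cartesian product over factor values -> one config dict per cell
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # 4 * 2 * 3 * 3 = 72 cells (before repeating over seeds)
```

Training many random seeds per cell then gives a population of networks whose within-cell variability is the degeneracy being measured.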

Our unified framework measures & controls degeneracy at 3 levels:
🎯 Behavior: variability in OOD performance
🧠 Dynamics: distance between neural trajectories, quantified by Dynamical Similarity Analysis
⚙️ Weights: permutation-invariant Frobenius distance between recurrent weights

24.11.2025 16:43 - 👍 4    🔁 0    💬 1    📌 0
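[Editor's note] The weight-level measure has to discount the fact that relabeling an RNN's hidden units permutes rows and columns of the recurrent matrix without changing the network. A brute-force sketch of a permutation-invariant Frobenius distance (illustrative only: exact minimization over n! permutations is feasible only for tiny n, and this is not the paper's implementation):

```python
import itertools
import numpy as np

def perm_invariant_frobenius(W1, W2):
    """min over unit permutations P of ||W1 - P @ W2 @ P.T||_F.
    Brute force over all n! permutations; only feasible for tiny n.
    (Real RNNs may have further symmetries, e.g. sign flips for tanh units.)"""
    n = W1.shape[0]
    best = np.inf
    for perm in itertools.permutations(range(n)):
        idx = np.array(perm)
        W2p = W2[np.ix_(idx, idx)]  # relabel rows AND columns of W2
        best = min(best, np.linalg.norm(W1 - W2p))
    return best

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
perm = rng.permutation(4)
W_shuffled = W[np.ix_(perm, perm)]              # same network, units relabeled
print(perm_invariant_frobenius(W, W_shuffled))  # ~0: identical up to relabeling
```

A plain Frobenius distance would report these two matrices as far apart; minimizing over relabelings correctly reports zero.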

RNNs trained from different seeds on the same task can show strikingly different internal solutions, even when they perform equally well. We call this solution degeneracy.

24.11.2025 16:43 - 👍 6    🔁 0    💬 1    📌 0
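[Editor's note] Solution degeneracy already shows up in the smallest possible model. Fitting y = a·b·x by gradient descent from different seeds, the product a·b (the behavior) converges to the same value while the individual factors (the "weights") differ by seed. A minimal toy sketch, not from the paper:

```python
import numpy as np

def train(seed, steps=2000, lr=0.05):
    """Fit y = a*b*x to data with true slope a*b = 2, from a random init."""
    rng = np.random.default_rng(seed)
    a, b = rng.normal(1.0, 0.5, size=2)  # seed-dependent initialization
    x = np.linspace(-1, 1, 32)
    y = 2.0 * x                          # target slope 2
    for _ in range(steps):
        err = a * b * x - y
        g = np.mean(err * x)
        # chain rule: dL/da proportional to g*b, dL/db proportional to g*a
        a, b = a - lr * g * b, b - lr * g * a
    return a, b

a1, b1 = train(seed=0)
a2, b2 = train(seed=1)
print(a1 * b1, a2 * b2)  # both converge near 2.0: same behavior
print(a1, a2)            # but different factors: degenerate solutions
```

The same loss value hides different parameter configurations, which is exactly why comparing trained networks needs measures beyond task performance.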

๐Ÿ“Excited to share that our paper was selected as a Spotlight at #NeurIPS2025!

arxiv.org/pdf/2410.03972

It started from a question I kept running into:

When do RNNs trained on the same task converge/diverge in their solutions?
🧵⬇️

24.11.2025 16:43 - 👍 97    🔁 26    💬 5    📌 3

Our next paper on comparing dynamical systems (with special interest in artificial and biological neural networks) is out!! Joint work with @annhuang42.bsky.social , as well as @satpreetsingh.bsky.social , @leokoz8.bsky.social , Ila Fiete, and @kanakarajanphd.bsky.social : arxiv.org/pdf/2510.25943

10.11.2025 16:16 - 👍 67    🔁 23    💬 4    📌 3
