Thrilled to see this work accepted at NeurIPS!
Kudos to @hafezghm.bsky.social for the heroic effort in demonstrating the efficacy of seq-JEPA in representation learning from multiple angles.
#MLSky 🧠🤖
19.09.2025 18:46 · 18 likes · 4 reposts · 1 reply · 0 quotes
Excited to share that seq-JEPA has been accepted to NeurIPS 2025!
19.09.2025 18:02 · 15 likes · 2 reposts · 2 replies · 2 quotes
Interestingly, seq-JEPA shows path integration capabilities, an important research problem in neuroscience. By observing a sequence of views and their corresponding actions, it can integrate the path connecting the initial view to the final view.
(9/10)
14.05.2025 12:52 · 5 likes · 0 reposts · 1 reply · 0 quotes
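Path integration in this sense amounts to composing the sequence of relative transformations between views. A toy sketch of the idea (not the authors' code; planar rotation angles stand in for the actions, and simple summation stands in for the learned aggregator):

```python
import math

def integrate_path(relative_angles):
    """Compose a sequence of relative planar rotations (one per action)
    into the single rotation linking the initial view to the final view."""
    return math.fmod(sum(relative_angles), 2 * math.pi)

# A viewer takes four saccade-like steps around an object; composing the
# relative rotations recovers the total initial-to-final rotation.
actions = [0.3, -0.1, 0.5, 0.2]
total = integrate_path(actions)
```

In seq-JEPA the analogous quantity is decoded from the learned aggregate representation rather than computed in closed form; the sketch only illustrates what "integrating the path" means.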
Thanks to action conditioning, the visual backbone encodes rotation information, which can be decoded from its representations, while the transformer encoder aggregates the different rotated views, reduces intra-class variation (caused by rotations), and produces a semantic object representation.
(8/10)
14.05.2025 12:52 · 3 likes · 0 reposts · 1 reply · 0 quotes
On the 3D Invariant-Equivariant Benchmark (3DIEBench), where each object view has a different rotation, seq-JEPA achieves top performance on both invariance-related object categorization and equivariance-related rotation prediction without sacrificing one for the other.
(7/10)
14.05.2025 12:52 · 2 likes · 0 reposts · 1 reply · 0 quotes
Seq-JEPA learns invariant-equivariant representations for tasks with sequential observations and transformations; e.g., it can learn semantic image representations by seeing a sequence of small image patches across simulated eye movements, without hand-crafted augmentations or masking.
(6/10)
14.05.2025 12:52 · 4 likes · 0 reposts · 1 reply · 0 quotes
Post-training, the model has learned two segregated representations:
• An action-invariant aggregate representation
• Action-equivariant individual-view representations
💡 No explicit equivariance loss or dual predictor required!
(5/10)
14.05.2025 12:52 · 5 likes · 0 reposts · 1 reply · 0 quotes
Inspired by this, we designed seq-JEPA, which processes sequences of views and their relative transformations (actions).
⚡️ A transformer encoder aggregates these action-conditioned view representations to predict a yet-unseen view.
(4/10)
14.05.2025 12:52 · 4 likes · 1 repost · 1 reply · 0 quotes
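The forward pass described in this post can be sketched in a few lines. This is a toy stand-in under loud assumptions, not the released implementation: dimensions are made up, a fixed linear map plays the backbone, mean-pooling replaces the transformer encoder, and there is no training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper).
D_VIEW, D_ACT, D_REP = 32, 4, 16

# Stand-ins for learned modules: a linear "backbone" over the
# action-conditioned input, and a linear predictor head.
W_enc = rng.normal(size=(D_VIEW + D_ACT, D_REP))
W_pred = rng.normal(size=(D_REP, D_REP))

def seq_jepa_step(views, actions):
    """Encode each (view, action) pair, aggregate across the sequence,
    and predict the representation of a yet-unseen view."""
    x = np.concatenate([views, actions], axis=1)  # action conditioning
    per_view = np.tanh(x @ W_enc)                 # per-view (equivariant) representations
    aggregate = per_view.mean(axis=0)             # sequence aggregate (invariant); a
                                                  # transformer encoder does this in seq-JEPA
    prediction = aggregate @ W_pred               # predicted unseen-view representation
    return per_view, aggregate, prediction

views = rng.normal(size=(5, D_VIEW))    # 5 observed views
actions = rng.normal(size=(5, D_ACT))   # relative transformations between views
per_view, aggregate, prediction = seq_jepa_step(views, actions)
```

The key structural point the sketch preserves is that per-view representations exist before aggregation (where equivariant information can live) while the pooled aggregate is the thing used for prediction (where invariance emerges).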
🧠 Humans learn to recognize new objects by moving around them, manipulating them, and probing them via eye movements. Different views of a novel object are generated through actions (manipulations and eye movements) and are then integrated to form new concepts in the brain.
(3/10)
14.05.2025 12:52 · 3 likes · 0 reposts · 1 reply · 0 quotes
Current self-supervised learning (SSL) methods face a trade-off: optimizing for transformation invariance in representation space (useful for high-level classification) often reduces equivariance (needed for detail-sensitive tasks such as object rotation and movement). Our world model, seq-JEPA, resolves this trade-off.
(2/10)
14.05.2025 12:52 · 5 likes · 0 reposts · 1 reply · 0 quotes
Preprint Alert!
Can we simultaneously learn transformation-invariant and transformation-equivariant representations with self-supervised learning?
TL;DR Yes! This is possible via simple predictive learning & architectural inductive biases β without extra loss terms and predictors!
🧵 (1/10)
14.05.2025 12:52 · 51 likes · 16 reposts · 1 reply · 5 quotes
Assistant Professor of Machine Learning, Carnegie Mellon University (CMU)
Building a Natural Science of Intelligence π§ π€β¨
Prev: ICoN Postdoctoral Fellow @MIT, PhD @Stanford NeuroAILab
Personal Website: https://cs.cmu.edu/~anayebi
Neuroscientist @ Academia Sinica, NPAS, IBMS; National Taiwan University LS | In search of Biophysics-informed neural and behavioral algorithms | Hippocampus, Memory, Neural code for space and time
on X
https://x.com/hiallen72?s=21&t=WrQo8yo4jXdkrU8t8qeRRw
PhD student, Princeton Neuroscience Institute
Current interests: social behavior, recurrent neural networks, computational ethology
Assist. prof. at Université de Montréal and Mila · machine learning for science · climate change and health · open science · he/él/il #PalestinianLivesMatter
alexhernandezgarcia.github.io
Post-doc at Mila & U de Montréal in Guillaume Lajoie & Matt Perich's labs
Focus on neuroscience, RL for motor learning, neural control of movement, NeuroAI.
Ph.D. student @ IUI. Studying #MachineLearning, #NeuroSymbolicAI
Neuroscience PhD candidate in the Simoncelli and Chung labs at NYU. Trying to force computers to see the way people do.
Flatiron Research Fellow #FlatironCCN. PhD from #mitbrainandcog. Incoming Asst Prof #CarnegieMellon in Fall 2025. I study how humans and computers hear and see.
PhD Student @UMontreal @Mila | Neural intelligence → Machine intelligence | My Bio: CompSci + Math → AI & CogSci → neural TTS → CompNeuro → Neuro + AI | https://sites.google.com/view/sungjaecho
Mathematical Neuroscientist. Biologically Inclined.
Postdoctoral Researcher @ NeuroPsi (CNRS, Paris-Saclay).
Modelling psychedelic- and stimulation-induced shifts in consciousness.
Theorising about computation, causation, information, and emergence.
Assistant Professor, McGill University | Associate Academic Member, Mila - Quebec AI Institute | Neuroscience and AI, learning and inference, dopamine and cognition
https://massetlab.org/
theory of neural networks for natural and artificial intelligence
https://pehlevan.seas.harvard.edu/
Computational neuroscientist bringing machine and neural learning closer together @ox.ac.uk @oxforddpag.bsky.social 🇵🇹🇪🇺🇬🇧
neuralml.github.io
NeuroAI, vision, open science. NeuroAI researcher at Amaranth Foundation. Previously engineer @ Google, Meta, Mila. Updates from http://neuroai.science
Neuroscience & AI at University of Oxford and University of Cambridge | Principles of efficient computations + learning in brains, AI, and silicon 🧠 NeuroAI | Gates Cambridge Scholar
www.jachterberg.com
AI PhDing at Mila/McGill (prev FAIR intern). Happily residing in Montreal 🥯❄️
Academic: language grounding, vision+language, interp, rigorous & creative evals, cogsci
Other: many sports, urban explorations, puzzles/quizzes
bennokrojer.com
Artificial Intelligence, Machine Learning, Neuroscience, Complex Systems, Economics.
PhD Student at the University of Tehran.
Cofounder: @AutocurriculaLab, @NeuroAILab, @LangTechAI.
https://sites.google.com/a/umich.edu/aslansdizaji/
PhD candidate @real.itu.dk
Artificial Life | Complex Systems | Neural Networks
Computational Cognitive Scientist 🧠🤖 • NeuroAI, Predictive Coding, RL & Deep Learning, Complex Systems • Postdoc at @siegellab.bsky.social, @unituebingen.bsky.social • Husband & Dad
https://scholar.google.com/citations?hl=en&user=k5eR8_oAAAAJ
Ask me about Reinforcement Learning
Research @ Sony AI
AI should learn from its experiences, not copy your data.
My website for answering RL questions: https://www.decisionsanddragons.com/
Views and posts are my own.