NeurIPS 2025 Workshop DBM
Welcome to the OpenReview homepage for NeurIPS 2025 Workshop DBM
🚨 Deadline Extended 🚨
The submission deadline for the Data on the Brain & Mind Workshop (NeurIPS 2025) has been extended to Sep 8 (AoE)! 🧠✨
We invite you to submit your findings or tutorials via the OpenReview portal:
openreview.net/group?id=Neu...
27.08.2025 19:45
Data on the Brain & Mind
📢 10 days left to submit to the Data on the Brain & Mind Workshop at #NeurIPS2025!
Call for:
• Findings (4 or 8 pages)
• Tutorials
If you're submitting to ICLR or NeurIPS, consider submitting here too, and highlight how to use a cog neuro dataset in our tutorial track!
data-brain-mind.github.io
25.08.2025 15:43
🚨 Excited to announce our #NeurIPS2025 Workshop: Data on the Brain & Mind
📣 Call for: Findings (4- or 8-page) + Tutorials tracks
Speakers include @dyamins.bsky.social @lauragwilliams.bsky.social @cpehlevan.bsky.social
Learn more: data-brain-mind.github.io
04.08.2025 15:28
Language Models in Plato's Cave
Why language models succeeded where video models failed, and what that teaches us about AI
This is an excellent and very clear piece from Sergey Levine about the strengths and limitations of large language models.
sergeylevine.substack.com/p/language-m...
12.06.2025 16:30
Normalizing Flows (NFs) check all the boxes for RL: exact likelihoods (imitation learning), efficient sampling (real-time control), and variational inference (Q-learning)! Yet they are overlooked in favor of more expensive and less flexible contemporaries like diffusion models.
Are NFs fundamentally limited?
05.06.2025 17:05
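For concreteness, here is a minimal numpy sketch of the properties the post lists, using a single RealNVP-style affine coupling layer. It is illustrative only and not taken from any of the RL methods mentioned: the inverse is closed-form, so sampling is one cheap pass, and the log-determinant is a simple sum, so likelihoods are exact.

```python
import numpy as np

rng = np.random.default_rng(0)

# One affine coupling layer on 4-d data: split x into (x1, x2).
# x2 is scaled/shifted by functions of x1 only, so the Jacobian is
# triangular and its log-determinant is just a sum of the log-scales.
W_s, b_s = 0.1 * rng.normal(size=(2, 2)), np.zeros(2)
W_t, b_t = 0.1 * rng.normal(size=(2, 2)), np.zeros(2)

def forward(x):
    """Map data x -> latent z, returning the exact log|det J|."""
    x1, x2 = x[:, :2], x[:, 2:]
    s = np.tanh(x1 @ W_s + b_s)   # log-scale, bounded for stability
    t = x1 @ W_t + b_t            # shift
    z = np.concatenate([x1, x2 * np.exp(s) + t], axis=1)
    return z, s.sum(axis=1)       # exact log-determinant, O(d)

def inverse(z):
    """Closed-form inverse: sampling is a single cheap pass."""
    z1, z2 = z[:, :2], z[:, 2:]
    s = np.tanh(z1 @ W_s + b_s)
    t = z1 @ W_t + b_t
    return np.concatenate([z1, (z2 - t) * np.exp(-s)], axis=1)

def log_prob(x):
    """Exact log-likelihood under a standard normal base density."""
    z, log_det = forward(x)
    base = -0.5 * (z ** 2).sum(axis=1) - 0.5 * z.shape[1] * np.log(2 * np.pi)
    return base + log_det

x = rng.normal(size=(5, 4))
print(log_prob(x))                          # exact likelihoods (imitation learning)
samples = inverse(rng.normal(size=(5, 4)))  # one-pass sampling (real-time control)
print(samples.shape)
```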
How can agents trained to reach (temporally) nearby goals generalize to attain distant goals?
Come to our #ICLR2025 poster now to discuss *horizon generalization*!
w/ @crji.bsky.social and @ben-eysenbach.bsky.social
Hall 3 + Hall 2B #637
26.04.2025 02:12
Our new #ICLR2025 paper presents a unified framework for intrinsic motivation and reward shaping: both signal the value of the RL agent's state (= external state + past experience). Rewards based on potentials over the learning agent's state provably avoid reward hacking! 🧵
26.03.2025 00:05
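A hedged sketch of the identity behind that claim, in toy notation rather than the paper's: a shaping term built from a potential Phi over the agent's state (external observation plus a summary of past experience) telescopes along any trajectory, so in the discounted limit it shifts every policy's return by the same constant and cannot change which policy is optimal. The potential, memory summary, and trajectory below are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.9
T = 20

# Toy "agent state": an external observation plus a running summary of past
# experience. Phi is an arbitrary potential over that agent state; both the
# potential and the memory summary are illustrative, not the paper's.
def phi(obs, memory):
    return np.tanh(obs).sum() + 0.5 * memory

obs = rng.normal(size=(T + 1, 3))                               # external observations
memory = np.array([obs[: t + 1].mean() for t in range(T + 1)])  # crude memory of the past
rewards = rng.normal(size=T)                                    # arbitrary base rewards

# Potential-based shaping term: F_t = gamma * Phi(s_{t+1}) - Phi(s_t)
F = np.array([gamma * phi(obs[t + 1], memory[t + 1]) - phi(obs[t], memory[t])
              for t in range(T)])

discounts = gamma ** np.arange(T)
shaped_return = (discounts * (rewards + F)).sum()
plain_return = (discounts * rewards).sum()

# The shaping terms telescope to gamma^T * Phi(s_T) - Phi(s_0). In the
# discounted long-horizon limit the first term vanishes, leaving a constant
# offset -Phi(s_0) that is the same for every policy, so the optimal policy
# is unchanged.
offset = gamma ** T * phi(obs[T], memory[T]) - phi(obs[0], memory[0])
print(np.isclose(shaped_return, plain_return + offset))  # True
```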
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
Thanks to incredible collaborators Bill Zheng, Anca Dragan, Kuan Fang, and Sergey Levine!
Website: tra-paper.github.io
Paper: arxiv.org/pdf/2502.05454
14.02.2025 01:39
...but to create truly autonomous self-improving agents, we must not only imitate, but also *improve* upon the capabilities seen in training. Our findings suggest that this improvement might emerge from better task representations, rather than more complex learning algorithms. 7/
14.02.2025 01:39
*Why does this matter?* Recent breakthroughs in both end-to-end robot learning and language modeling have been enabled not through complex TD-based reinforcement learning objectives, but rather through scaling imitation with large architectures and datasets... 6/
14.02.2025 01:39
We validated this in simulation. Across offline RL benchmarks, imitation using our TRA task representations outperformed standard behavioral cloning, especially for stitching tasks. In many cases, TRA beat "true" value-based offline RL, using only an imitation loss. 5/
14.02.2025 01:39
Successor features have long been known to boost RL generalization (Dayan, 1993). Our findings suggest something stronger: successor task representations produce emergent capabilities beyond training even without RL or explicit subtask decomposition. 4/
14.02.2025 01:39
This trick encourages a form of time invariance during learning: both nearby and distant goals are represented similarly. By additionally aligning language instructions ψ(ℓ) to the goal representations ψ(g), the policy can also perform new compound language tasks. 3/
14.02.2025 01:39
What does temporal alignment mean? When training, our policy imitates the human actions that lead to the end goal g of a trajectory. Rather than training on the raw goals, we use a representation ψ(g) that aligns with the successor features ψ(s) of the preceding states. 2/
14.02.2025 01:39
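As a rough illustration of what aligning ψ(g) with ψ(s) could look like in code: the snippet below uses random linear encoders and an InfoNCE-style contrastive loss as stand-ins. The actual TRA objective is specified in the paper, so treat every name and shape here as an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, batch = 8, 16

# Illustrative encoders: random linear maps standing in for learned networks.
W_state = rng.normal(size=(dim, dim))
W_goal = rng.normal(size=(dim, dim))

def psi_state(s):   # representation of the current state (successor-feature-like)
    return s @ W_state

def psi_goal(g):    # representation of a goal (or a language instruction)
    return g @ W_goal

def alignment_loss(states, goals):
    """InfoNCE-style alignment: each state should score highest against the
    goal drawn from its own trajectory, relative to the other goals in the
    batch. Minimizing this pulls psi_state(s) toward psi_goal(g)."""
    logits = psi_state(states) @ psi_goal(goals).T           # (batch, batch)
    logits -= logits.max(axis=1, keepdims=True)              # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))                    # positives on the diagonal

# Fake batch: states paired with (nearby or distant) goals from the same trajectory.
states = rng.normal(size=(batch, dim))
goals = states + 0.1 * rng.normal(size=(batch, dim))
print(alignment_loss(states, goals))
```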
Current robot learning methods are good at imitating tasks seen during training, but struggle to compose behaviors in new ways. When training imitation policies, we found something surprising: using temporally aligned task representations enabled compositional generalization. 1/
14.02.2025 01:39
Excited to share new work led by @vivekmyers.bsky.social and @crji.bsky.social that proves you can learn to reach distant goals by solely training on nearby goals. The key idea is a new form of invariance. This invariance implies generalization w.r.t. the horizon.
06.02.2025 01:13
Want to see an agent carry out long-horizon tasks when only trained on short-horizon trajectories?
We formalize and demonstrate this notion of *horizon generalization* in RL.
Check out our website! horizon-generalization.github.io
04.02.2025 20:50
What does this mean in practice? To generalize to long-horizon goal-reaching behavior, we should consider how our GCRL algorithms and architectures enable invariance to planning. When possible, prefer architectures like quasimetric networks (MRN, IQE) that enforce this invariance. 6/
04.02.2025 20:37
Empirical results support this theory. The degree of planning invariance and the degree of horizon generalization are correlated across environments and GCRL methods. Critics parameterized as a quasimetric distance indeed tend to generalize best across horizons. 5/
04.02.2025 20:37
Similar to how CNN architectures exploit the inductive bias of translation-invariance for image classification, RL policies can enforce planning invariance by using a *quasimetric* critic parameterization that is guaranteed to obey the triangle inequality. 4/
04.02.2025 20:37
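To make "quasimetric" concrete, here is a tiny numpy check of the property at stake. The distance below is a simple asymmetric form, not the exact MRN or IQE parameterizations from the paper: it has zero self-distance and provably satisfies the triangle inequality while remaining asymmetric, which is what lets learned distances compose along waypoints.

```python
import numpy as np

rng = np.random.default_rng(3)

def quasimetric(x, y):
    """Asymmetric distance d(x, y) = sum_i max(0, y_i - x_i): zero on the
    diagonal and satisfies the triangle inequality, but d(x, y) != d(y, x)
    in general. This is the structure a quasimetric critic enforces."""
    return np.maximum(0.0, y - x).sum(axis=-1)

# Random embeddings standing in for learned state/goal features.
x, y, z = rng.normal(size=(3, 1000, 16))

direct = quasimetric(x, z)                            # "go straight to the goal"
via_waypoint = quasimetric(x, y) + quasimetric(y, z)  # "plan through a waypoint"

print(np.all(direct <= via_waypoint + 1e-9))              # True: planning never looks shorter
print(np.allclose(quasimetric(x, x), 0.0))                # True: zero self-distance
print(np.allclose(quasimetric(x, y), quasimetric(y, x)))  # False: asymmetric, unlike a metric
```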
The key to achieving horizon generalization is *planning invariance*. A policy is planning invariant if decomposing tasks into simpler subtasks doesn't improve performance. We prove planning invariance can enable horizon generalization. 3/
04.02.2025 20:37
Certain RL algorithms are more conducive to horizon generalization than others. Goal-conditioned RL (GCRL) methods with a bilinear critic φ(s)ᵀψ(g), as well as quasimetric methods, better enable horizon generalization. 2/
04.02.2025 20:37
Reinforcement learning agents should be able to improve upon behaviors seen during training.
In practice, RL agents often struggle to generalize to new long-horizon behaviors.
Our new paper studies *horizon generalization*, the degree to which RL algorithms generalize to reaching distant goals. 1/
04.02.2025 20:37
Learning to Assist Humans without Inferring Rewards
Website: empowering-humans.github.io
Paper: arxiv.org/abs/2411.02623
Many thanks to wonderful collaborators Evan Ellis, Sergey Levine, Benjamin Eysenbach, and Anca Dragan!
22.01.2025 02:17
Effective empowerment could also be combined with other objectives (e.g., RLHF) to improve assistance and promote safety (preventing human disempowerment). 6/
22.01.2025 02:17
In principle, this approach provides a general way to align RL agents from human interactions without needing human feedback or other rewards. 5/
22.01.2025 02:17
We show that optimizing this human effective empowerment helps in assistive settings. Theoretically, we show that maximizing effective empowerment optimizes an (average-case) lower bound on the human's utility/reward/objective under an uninformative prior. 4/
22.01.2025 02:17
Our recent paper, "Learning to Assist Humans Without Inferring Rewards," proposes a scalable contrastive estimator for human empowerment. The estimator learns successor features to model the effects of a human's actions on the environment, approximating the *effective empowerment*. 3/
22.01.2025 02:17
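The actual estimator is specified in the paper; as a loose sketch of the kind of quantity involved, the snippet below computes an InfoNCE-style score of how much the human's actions tell you about future outcomes, with random linear maps standing in for learned successor features. All names, shapes, and the coupling construction are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(4)
dim, batch = 8, 32

# Illustrative feature maps (random linear stand-ins for learned networks).
W_act = rng.normal(size=(4, dim))                            # embeds the human's action
W_succ = np.concatenate([W_act, rng.normal(size=(2, dim))])  # successor-feature-like embedding of a future state

def influence_score(actions, futures):
    """InfoNCE-style score: how well does each action identify the future
    state it actually produced, relative to other futures in the batch?
    With trained features, this kind of score lower-bounds the mutual
    information between the human's actions and future outcomes."""
    logits = (actions @ W_act) @ (futures @ W_succ).T          # (batch, batch)
    logits -= logits.max(axis=1, keepdims=True)                # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return np.mean(np.diag(log_softmax)) + np.log(batch)

actions = rng.normal(size=(batch, 4))
# Futures whose first coordinates are literally the action taken (high influence)
futures_coupled = np.concatenate([actions, rng.normal(size=(batch, 2))], axis=1)
# Futures unrelated to the actions (no influence)
futures_random = rng.normal(size=(batch, 6))
print(influence_score(actions, futures_coupled))   # higher: the actions shape the future
print(influence_score(actions, futures_random))    # much lower: the actions carry no information
```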
This distinction is subtle but important. An agent that maximizes a misspecified model of the human's reward or seeks power for itself can lead to arbitrarily bad outcomes where the human becomes disempowered. Maximizing human empowerment avoids this. 2/
22.01.2025 02:17
Postdoc at Helmholtz Munich (Schulz lab) and MPI for Biological Cybernetics (Dayan lab) || Ph.D. from EPFL (Gerstner lab) || Working on computational models of learning and decision-making in the brain; https://sites.google.com/view/modirsha
Senior Research Fellow @ ucl.ac.uk/gatsby & sainsburywellcome.org
{learning, representations, structure} in 🧠🐭🤖
my work: eringrant.github.io
not active: sigmoid.social/@eringrant @eringrant@sigmoid.social, twitter.com/ermgrant @ermgrant
Workshop at #NeurIPS2025 aiming to connect machine learning researchers with neuroscientists and cognitive scientists by focusing on concrete, open problems grounded in emerging neural and behavioral datasets.
https://data-brain-mind.github.io
(he/him) Assistant Professor of Cognitive Science at Central European University in Vienna, PI of the CEU Causal Cognition Lab (https://ccl.ceu.edu) #CogSci #PsychSciSky #SciSky
Personal site: https://www.jfkominsky.com
PhD student in Reinforcement Learning at KTH Stockholm 🇸🇪
https://www.kth.se/profile/stesto
https://www.linkedin.com/in/stojanovic-stefan/
AI PhD student at Berkeley
alyd.github.io
Alignment and coding agents. AI @ BAIR | Scale | Imbue
http://evanellis.com/
PhD @ Warsaw University of Technology
JacGCRL
Reinforcement Learning / Continual Learning / Neural Networks Plasticity
princeton physics phd
mit '23 physics + math
RL, interpretable AI4Science, stat phys
Assistant professor at Princeton CS working on reinforcement learning and AI/ML.
Site: https://ben-eysenbach.github.io/
Lab: https://princeton-rl.github.io/
Persian psychologist
Therapist for depression and anxiety disorders.
Research Scientist at DeepMind. Opinions my own. Inventor of GANs. Lead author of http://www.deeplearningbook.org . Founding chairman of www.publichealthactionnetwork.org
VP of Research, GenAI @ Meta (Multimodal LLMs, AI Agents), UPMC Professor of Computer Science at CMU, ex-Director of AI research at @Apple, co-founder Perceptual Machines (acquired by Apple)
Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...
PhD student @uwnlp.bsky.social @uwcse.bsky.social | visiting researcher @MetaAI | previously @jhuclsp.bsky.social
https://stellalisy.com
AI interp @UC Berkeley | prev. MIT
jiahai-feng.github.io
Research Director, Founding Faculty, Canada CIFAR AI Chair @VectorInst.
Full Prof @UofT - Statistics and Computer Sci. (x-appt) danroy.org
I study assumption-free prediction and decision making under uncertainty, with inference emerging from optimality.
Assistant Professor at UW and Staff Research Scientist at Google DeepMind. Social Reinforcement Learning in multi-agent and human-AI interactions. PhD from MIT. Check out https://socialrl.cs.washington.edu/ and https://natashajaques.ai/.