
David Heurtel-Depeiges

@heurteldepeiges.bsky.social

1st Year PhD Student @ Mila under the supervision of Sarath Chandar

13 Followers  |  546 Following  |  5 Posts  |  Joined: 22.11.2024

Latest posts by heurteldepeiges.bsky.social on Bluesky


The Expressive Limits of Diagonal SSMs for State-Tracking
State-Space Models (SSMs) have recently been shown to achieve strong empirical performance on a variety of long-range sequence modeling tasks while remaining efficient and highly parallelizable....

๐Ÿ“ openreview.net/forum?id=5bg...
Joint work with Mehran Shakerinava, Behnoush Khavari, Siamak Ravanbakhsh, and @sarath-chandar.bsky.social @mila-quebec.bsky.social.

10.02.2026 16:54 · 👍 1    🔁 1    💬 0    📌 0

New work, just accepted @ICLR: "The Expressive Limits of Diagonal SSMs for State-Tracking"
We give a complete characterization of what diagonal SSMs can and cannot compute on state-tracking tasks, and the answer is deeply connected to group theory.
🧵👇

10.02.2026 16:54 · 👍 2    🔁 2    💬 1    📌 0
NeoBERT: A Next-Generation BERT
Recent innovations in architecture, pre-training, and fine-tuning have led to the remarkable in-context learning and reasoning abilities of large auto-regressive language models such as LLaMA and Deep...

NeoBERT: A Next-Generation BERT (TMLR Journal-to-Conference Track)

We modernized BERT (RoPE, SwiGLU, 4k context). At just 250M params, it outperforms RoBERTa and ModernBERT on the MTEB benchmark.
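For readers curious what the SwiGLU part of that modernization looks like, here is a minimal NumPy sketch of a SwiGLU feed-forward block; the dimensions below are toy values for illustration, not NeoBERT's actual configuration:

```python
import numpy as np

def silu(x):
    # SiLU (swish) activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward: the gate branch goes through SiLU and is
    # multiplied elementwise with the "up" projection before projecting down.
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32                     # toy sizes, not the real model's
x = rng.normal(size=(4, d_model))         # 4 token embeddings
w_gate = rng.normal(size=(d_model, d_ff))
w_up = rng.normal(size=(d_model, d_ff))
w_down = rng.normal(size=(d_ff, d_model))
y = swiglu_ffn(x, w_gate, w_up, w_down)
print(y.shape)  # (4, 8)
```

The gating is what distinguishes SwiGLU from the plain GELU MLP used in the original BERT.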

📄 arxiv.org/abs/2502.19587

03.02.2026 15:02 · 👍 2    🔁 1    💬 0    📌 0
The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning
Reinforcement learning (RL) has recently become a strong recipe for training reasoning LLMs that produce long chains of thought (LongCoT). Yet the standard RL "thinking environment", where the state i...

The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning

We achieve linear-complexity reasoning. Our "Delethink" decouples thought length from context, matching LongCoT performance with ≈25% of the compute.
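A toy way to see the decoupling: with a full-history LongCoT prompt, the context grows with every reasoning chunk, while a Markovian scheme carries only a fixed-size state between chunks. The chunk and carry sizes below are made up for illustration and are not the paper's settings:

```python
# Toy comparison of context growth: full-history "LongCoT" vs. a
# Markovian scheme that carries a fixed-size state between chunks.

CHUNK = 16      # tokens generated per reasoning chunk (made-up value)
CARRY = 8       # tokens of state carried forward (made-up value)
N_CHUNKS = 50

def longcot_context_sizes():
    sizes, ctx = [], 0
    for _ in range(N_CHUNKS):
        sizes.append(ctx)       # prompt = everything generated so far
        ctx += CHUNK            # context keeps growing each chunk
    return sizes

def markovian_context_sizes():
    sizes, state = [], 0
    for _ in range(N_CHUNKS):
        sizes.append(state)     # prompt = carried state only
        state = CARRY           # fixed-size after the first chunk
    return sizes

full = longcot_context_sizes()
markov = markovian_context_sizes()
print(max(full), max(markov))   # context grows vs. stays bounded
```

Since per-chunk attention cost scales with context length, the bounded context gives total compute linear in the number of chunks instead of quadratic.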

📄 arxiv.org/abs/2510.06557

03.02.2026 15:02 · 👍 1    🔁 1    💬 1    📌 0

The Expressive Limits of Diagonal SSMs for State-Tracking

We prove a tight bound: diagonal SSMs are provably incapable of tracking non-Abelian groups. A critical look at where efficient models fail and where they succeed.
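A small sketch of the obstruction, simplified to a purely multiplicative diagonal recurrence (not the paper's full setting): diagonal matrices commute, so the final state is order-invariant, while composition in a non-Abelian group such as S3 is not:

```python
import numpy as np

def compose(p, q):
    # (p ∘ q)(i) = p(q(i)), permutations represented as tuples
    return tuple(p[q[i]] for i in range(len(q)))

# Two transpositions in S3: composition order matters (non-Abelian).
swap01 = (1, 0, 2)   # swaps elements 0 and 1
swap02 = (2, 1, 0)   # swaps elements 0 and 2
assert compose(swap01, swap02) != compose(swap02, swap01)

# A diagonal linear recurrence h_t = A(x_t) h_{t-1} with diagonal A(x):
# scalar multiplications commute, so the final state depends only on the
# multiset of inputs, never on their order.
def diag_state(seq, diag_of):
    h = np.ones(3)
    for x in seq:
        h = diag_of[x] * h      # elementwise product = diagonal matmul
    return h

diag_of = {"a": np.array([2.0, 3.0, 5.0]),     # arbitrary toy diagonals
           "b": np.array([7.0, 11.0, 13.0])}
h_ab = diag_state(["a", "b"], diag_of)
h_ba = diag_state(["b", "a"], diag_of)
assert np.allclose(h_ab, h_ba)   # order-invariant: cannot track S3 products
```

Tracking an S3 word problem requires distinguishing input orderings, which this commuting state update can never do.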

📄 openreview.net/forum?id=5bg...

03.02.2026 15:02 · 👍 1    🔁 1    💬 1    📌 0

Excited to share that we have 3 papers accepted at #ICLR2026! 🇧🇷

Our work this year focuses on efficiency and expressivity: deriving theoretical limits for SSMs, achieving linear scaling for reasoning, and modernizing encoder architectures.

A summary of our work 👇 🧵

03.02.2026 15:02 · 👍 2    🔁 1    💬 1    📌 0

At Chandar Lab, we are happy to announce the third edition of our assistance program to provide feedback for members of communities underrepresented in AI who want to apply to high-profile graduate programs. Want feedback? Details: chandar-lab.github.io/grad_app/. Deadline: Nov 01!

03.10.2025 15:20 · 👍 1    🔁 1    💬 0    📌 1

Have a look at the NovoMolGen implementation on our lab's Hugging Face page! It's easy to work with, and you can generate new molecules in no time.

08.09.2025 17:14 · 👍 0    🔁 0    💬 0    📌 0

We just made NovoMolGen easy to play with: Transformers-native checkpoints on the Hub and small notebooks that let you load, sample, and fine-tune in minutes. A few lines of code load the model, plug in a reward, run a short RL fine-tune, and plot the curve.
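To illustrate the "plug in a reward" pattern (this is not NovoMolGen's actual API; the vocabulary, reward, and policy below are toy stand-ins), here is a minimal REINFORCE loop on a categorical policy:

```python
import math, random

random.seed(0)
vocab = ["C", "N", "O"]               # toy "atom" vocabulary, made up here
logits = {t: 0.0 for t in vocab}      # stand-in for real model parameters

def probs():
    z = {t: math.exp(logits[t]) for t in vocab}
    s = sum(z.values())
    return {t: z[t] / s for t in vocab}

def reward(tok):
    # Toy reward: prefer carbons. A real run would score whole molecules.
    return 1.0 if tok == "C" else 0.0

lr, baseline = 0.5, 0.0
for step in range(300):
    p = probs()
    tok = random.choices(vocab, weights=[p[t] for t in vocab])[0]
    adv = reward(tok) - baseline                  # advantage vs. running baseline
    baseline = 0.9 * baseline + 0.1 * reward(tok)
    for t in vocab:
        # REINFORCE: gradient of log-softmax is (1[t == tok] - p[t])
        logits[t] += lr * adv * ((1.0 if t == tok else 0.0) - p[t])

best = max(logits, key=logits.get)
print(best)   # the reward-preferred token dominates after training
```

Swapping the toy reward for a molecular property scorer and the categorical policy for the pretrained checkpoint gives the workflow the notebooks walk through.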

08.09.2025 16:07 · 👍 3    🔁 2    💬 1    📌 1

Collaborative multi-agent reinforcement learning is key for the future of AI. Check out R3D2, a generalist agent playing text-based Hanabi, accepted at ICLR 2025.

Website: chandar-lab.github.io/R3D2-A-Gener...

04.04.2025 17:16 · 👍 2    🔁 2    💬 0    📌 0

I am excited to share that our BindGPT paper won the best poster award at #AAAI2025! Congratulations to the team! Work led by @artemzholus.bsky.social!

05.03.2025 14:54 · 👍 9    🔁 4    💬 0    📌 0
chandar-lab/NeoBERT · Hugging Face

The best part? We are open-sourcing everything, including the intermediate model checkpoints. The main model is already on Hugging Face; be sure to check it out! (6/n)

Model: huggingface.co/chandar-lab/...
Paper: arxiv.org/abs/2502.19587
Code and checkpoints to be released soon!

28.02.2025 16:30 · 👍 7    🔁 1    💬 1    📌 0

NeoBERT is very strong compared to all baselines, is fully open-source and open-weights (including intermediate checkpoints), and has higher tokens/s throughput. Give it a try and replace your favorite encoder with this new model!

28.02.2025 16:40 · 👍 0    🔁 0    💬 0    📌 0

Great work by great colleagues! Have a look at the paper.

28.02.2025 16:38 · 👍 0    🔁 0    💬 0    📌 0

Hi Rupali, could you please add me? Thanks a lot.

27.11.2024 16:48 · 👍 0    🔁 0    💬 0    📌 0
