Paper: openreview.net/forum?id=5bg...
Joint work with Mehran Shakerinava, Behnoush Khavari, Siamak Ravanbakhsh, and @sarath-chandar.bsky.social @mila-quebec.bsky.social.
@heurteldepeiges.bsky.social
1st Year PhD Student @ Mila under the supervision of Sarath Chandar
New work, just accepted @ICLR: "The Expressive Limits of Diagonal SSMs for State-Tracking"
We give a complete characterization of what diagonal SSMs can and cannot compute on state-tracking tasks, and the answer is deeply connected to group theory.
🧵👇
NeoBERT: A Next-Generation BERT (TMLR Journal-to-Conference Track)
We modernized BERT (RoPE, SwiGLU, 4k context). At just 250M params, it outperforms RoBERTa and ModernBERT on the MTEB benchmark.
Paper: arxiv.org/abs/2502.19587
The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning
We achieve linear-complexity reasoning. Our "Delethink" decouples thought length from context, matching LongCoT performance with ≈25% of the compute.
Paper: arxiv.org/abs/2510.06557
The Expressive Limits of Diagonal SSMs for State-Tracking
We prove a tight bound: diagonal SSMs are provably incapable of tracking non-Abelian groups. A critical look at where efficient models fail vs. where they succeed.
Paper: openreview.net/forum?id=5bg...
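The core obstruction, in one line (a sketch of the standard commutativity argument; the notation here is mine, not the paper's):

```latex
% Diagonal SSM recurrence (homogeneous part, input injections omitted):
h_t = \operatorname{diag}\!\big(\lambda(x_t)\big)\, h_{t-1}
\quad\Longrightarrow\quad
h_T = \Big(\prod_{t=1}^{T} \operatorname{diag}\!\big(\lambda(x_t)\big)\Big)\, h_0 .
% Diagonal matrices commute, so this product -- and hence h_T -- is
% invariant to reordering the inputs. But group products in a
% non-Abelian group are order-sensitive; in S_3:
(1\,2)(2\,3) = (1\,2\,3) \;\neq\; (1\,3\,2) = (2\,3)(1\,2).
```

So no readout of the final state can distinguish two input orders whose group products differ, which is exactly what state-tracking a non-Abelian group requires.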
Excited to share that we have 3 papers accepted at #ICLR2026! 🇧🇷
Our work this year focuses on efficiency and expressivity: deriving theoretical limits for SSMs, achieving linear scaling for reasoning, and modernizing encoder architectures.
A summary of our work 👇 🧵
At Chandar Lab, we are happy to announce the third edition of our assistance program to provide feedback for members of communities underrepresented in AI who want to apply to high-profile graduate programs. Want feedback? Details: chandar-lab.github.io/grad_app/. Deadline: Nov 01!
03.10.2025 15:20
Have a look at the NovoMolGen implementation on our lab's HF page! Easy to work with, and you can generate new molecules in no time.
08.09.2025 17:14
We just made NovoMolGen easy to play with: Transformers-native checkpoints on the Hub and small notebooks that let you load, sample, and fine-tune in minutes. A few lines of code load the model, plug in a reward, run a short RL fine-tune, and plot the curve.
08.09.2025 16:07
Collaborative multi-agent reinforcement learning is key for the future of AI. Check out R3D2, a generalist agent that plays text-based Hanabi, accepted at ICLR 2025.
Website: chandar-lab.github.io/R3D2-A-Gener...
I am excited to share that our BindGPT paper won the best poster award at #AAAI2025! Congratulations to the team! Work led by @artemzholus.bsky.social!
05.03.2025 14:54
The best part? We are open-sourcing everything, including the intermediate model checkpoints. The main model is already on Hugging Face; be sure to check it out! (6/n)
Model: huggingface.co/chandar-lab/...
Paper: arxiv.org/abs/2502.19587
Code and checkpoints to be released soon!
NeoBERT is very strong compared to all baselines, fully open source and open weights (including intermediate checkpoints), and it has higher tokens/s throughput. Give it a try and substitute your favorite encoder with this new model!
28.02.2025 16:40
Great work by great colleagues! Have a look at the paper.
28.02.2025 16:38
Hi Rupali, could you please add me? Thanks a lot.
27.11.2024 16:48