If you're an NYU student, come learn about this wonderful opportunity to collaborate with us at FAIR events.atmeta.com/metanyuaimen... Panel is tomorrow 10am at NYU Center for Data Science.
16.10.2025 14:45
@markibrahim.bsky.social
Researching the dark arts of deep learning at Meta's FAIR (Fundamental AI Research) Lab https://markibrahim.me/
We explain how good delimiters steer attention heads to key input tokens and offer practical recommendations for prompt and delimiter choices to get the best performance from your LLM. tl;dr: use "!" or "\n".
09.10.2025 14:31
- MMLU performance can vary by +/- 23% depending on the choice of delimiter across leading open model families (Llama, Qwen, and Gemma).
- Closed models, such as GPT-4o, are also brittle to the choice of delimiter.
🧵
One can manipulate LLM rankings to put any model in the lead merely by modifying the single character separating demonstration examples. Learn more in our new paper arxiv.org/abs/2510.05152
w/ Jingtong Su, Jianyu Zhang, @karen-ullrich.bsky.social, and Léon Bottou.
🧵
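The sensitivity above comes down to how demonstration examples are joined into one prompt. A minimal sketch of that setup (helper name and prompt format are hypothetical, not the paper's code): two prompts with identical content that differ only in the single delimiter character.

```python
def build_prompt(demos, question, delimiter):
    """Join (question, answer) demonstrations with `delimiter`, then append the query."""
    shots = delimiter.join(f"Q: {q} A: {a}" for q, a in demos)
    return f"{shots}{delimiter}Q: {question} A:"

demos = [("2+2?", "4"), ("3+3?", "6")]

# Same demonstrations and query; only the separator character changes.
p_newline = build_prompt(demos, "4+4?", "\n")
p_bang = build_prompt(demos, "4+4?", "!")
```

Under the paper's findings, swapping that one character can move benchmark accuracy by double digits, which is why the recommendation is to prefer "!" or "\n".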
Open weights for our Llip multimodal vision-language model, led by @lavoiems.bsky.social, are public!
Llip proposes a new pre-training objective that captures the many ways to describe an image, leading to strong performance across a suite of 22 zero-shot benchmarks.
bsky.app/profile/lavo...
We also find better models are not necessarily better at abstention, suggesting the skill of abstention is an open research question.
w/ @polkirichenko.bsky.social, Sam Bell, and Kamalika Chaudhuri
Paper: arxiv.org/abs/2506.09038
Code: github.com/facebookrese...
bsky.app/profile/polk...
🧵 2/2
A good language model should say "I don't know" by reasoning about the limits of its knowledge. Our new work AbstentionBench carefully measures this overlooked skill in an open codebase others can build on!
We find frontier reasoning degrades models' ability to know when NOT to answer.
🧵 1/2
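Measuring this skill requires deciding when a free-form answer counts as abstaining. A minimal sketch of that judgment (marker list and function name are hypothetical; AbstentionBench's actual evaluation is more careful than string matching):

```python
# Hypothetical abstention detector, illustrating the judgment only.
ABSTAIN_MARKERS = (
    "i don't know",
    "i do not know",
    "cannot be determined",
    "not enough information",
)

def is_abstention(answer: str) -> bool:
    """Return True if the answer contains a common abstention phrase."""
    a = answer.lower()
    return any(marker in a for marker in ABSTAIN_MARKERS)
```

On questions with unknowable answers, a well-calibrated model should trigger this check rather than produce a confident guess.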
Join us as a PhD research intern at FAIR w/ @polkirichenko.bsky.social and Kamalika Chaudhuri, starting this summer or fall, with a focus on open science in multimodal models, agents, and beyond! Email polkirichenko@meta.com with the subject [Prospective Intern 2025] and attach your CV if interested!
We found MLM-U-trained transformers can even outperform transformers trained with additional supervision from A* search traces, showing the promise of alternative learning objectives.
Learn more on our site and code at facebookresearch.github.io/maze_navigat...
Recently, we also applied the same MLM-U objective to maze navigation. We find when training parameter-matched transformers on identical data, MLM-U without any tweaks outperforms standard next token training across all maze grid sizes (up to 30x30).
11.12.2024 18:42
We find MLM-U training improves knowledge retrieval on Wikipedia-based questions and even outperforms a pretrained 7B Mistral model with a much smaller 100M-parameter transformer trained from scratch!
Come by our NeurIPS poster Exhibit Halls A-C #3204 11am PST Thursday to learn more.
We show that training with a factorization-agnostic objective, MLM-U (a variable-ratio BERT-style loss with links to discrete diffusion), which predicts multiple tokens ahead and back, can significantly mitigate the reversal curse!
11.12.2024 18:36
Problem: Language models struggle with the "reversal curse": an inability to answer reformulations of a question. We show this stems from the standard next-token learning objective, in what we call "the factorization curse."
11.12.2024 18:36
Can we boost transformers' ability to retrieve knowledge and plan in maze navigation by only tweaking the learning objective?
We emphatically say YES in our #NeurIPS 2024 study! 🧵
w/ Ouail Kitouni, Niklas Nolte, Diane Bouchacourt, Adina Williams, and Mike Rabbat
Paper: arxiv.org/abs/2406.05183
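The thread describes MLM-U as a variable-ratio BERT-style objective that predicts masked tokens both ahead and back. A minimal sketch of the corruption step, assuming the mask ratio is drawn uniformly per sequence (function and constant names are hypothetical, not the released code):

```python
import random

MASK = "<mask>"

def mlm_u_corrupt(tokens, rng):
    """Mask each token independently with a ratio drawn fresh per sequence.

    Unlike fixed-ratio BERT masking, the ratio itself is sampled from U(0, 1),
    so training covers everything from near-full context to near-full masking.
    The loss then predicts the masked tokens (ahead and back) from the visible
    ones, rather than committing to a single left-to-right factorization.
    """
    ratio = rng.random()
    corrupted, targets = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < ratio:
            corrupted.append(MASK)
            targets.append((i, tok))  # position and original token to predict
        else:
            corrupted.append(tok)
    return corrupted, targets

rng = random.Random(0)
tokens = ["Paris", "is", "the", "capital", "of", "France"]
corrupted, targets = mlm_u_corrupt(tokens, rng)
```

Because the visible context varies from step to step, the model must learn to recover facts in either direction, which is the intuition behind its advantage on reversal-style questions.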