Paper link: https://arxiv.org/abs/2510.00931
Led by: Ammar Khairi, @juliakreutzer.bsky.social, Daniel D'souza @mziizm.bsky.social
@cohereforai.bsky.social
@Cohere.com's non-profit research lab and open science initiative that seeks to solve complex machine learning problems. Join us in exploring the unknown, together. https://cohere.com/research
We are excited to present FusioN as a plug-and-play replacement for Best-of-N, shifting from a monolithic selection framework to a collaborative synthesis one that embraces the diverse strengths of today's leading open LLMs.
02.10.2025 10:00

How does FusioN use the same sample pool more effectively than BoN?
While BoN picks just one sample per problem, FusioN synthesises one output from all samples, treating them as collaborators whose strengths can be integrated, not competitors in a zero-sum game.
Want the wisdom-of-the-crowd in 1 model?
Fusion-of-N distills multiple teachers into richer synthetic data than BoN, training students that achieve bigger downstream gains, even surpassing teachers on multilingual factual reasoning.
Test-time scaling doesn't need to waste samples: Fusion-of-N turns every sample into signal, outperforming BoN across tasks, languages, and models.
Fusion-of-N boosts Command A win rates vs Gemini-2.5 Pro by +8.3% across 11 languages, a +4.0% improvement over BoN.
Fusion-of-N uses an LLM (the fusor) to merge multiple candidate answers into one.
Instead of selecting only one response, Fusion-of-N creates an even better answer by integrating insights across all samples.
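A toy sketch of that contrast in Python (the `score` function, prompt wording, and `fuse` callable are hypothetical stand-ins, not the paper's actual setup):

```python
# Hypothetical sketch contrasting Best-of-N (selection) with
# Fusion-of-N (synthesis). Real systems would use a reward model
# for `score` and an LLM call for `fuse`; both are toy stand-ins here.

def best_of_n(samples, score):
    """Best-of-N: keep the single highest-scoring sample, discard the rest."""
    return max(samples, key=score)

def fusion_of_n(samples, fuse):
    """Fusion-of-N: a fusor LLM sees every sample and synthesises one answer."""
    prompt = "Synthesise the best single answer from these candidates:\n"
    prompt += "\n".join(f"[{i}] {s}" for i, s in enumerate(samples))
    return fuse(prompt)

samples = [
    "Paris is the capital of France.",
    "France's capital is Paris, home to about 2.1 million people.",
]
picked = best_of_n(samples, score=len)          # one winner, the rest wasted
fused = fusion_of_n(samples, fuse=lambda p: p)  # every sample contributes
```

The point of the sketch: selection throws away all but one sample, while synthesis conditions the final answer on the whole pool.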
Is Best-of-N really the best use of your inference compute?
Introducing Fusion-of-N: a simple and powerful way to advance inference and distillation beyond Best-of-N.
Apply now: https://jobs.ashbyhq.com/cohere/7ec9eaf4-8cfc-4977-9041-86f73e7ab10b
30.09.2025 10:00

We're not your average lab. We're a hybrid research environment dedicated to revolutionizing the ML space.
And weβre hiring a Senior Research Scientist to co-create with us.
If you believe in research as a shared, global effort, this is your chance.
Led by: Srishti Gureja, Elena Tommasone, Jingyi He, @sarahooker.bsky.social, Matthias Galle, and @mziizm.bsky.social
Paper: https://arxiv.org/abs/2509.20837
The future of synthetic training hinges on rethinking verification. The answer is calibrated verification: complex, diverse test suites combined with flexible signals that break the Verification Ceiling and improve code LLMs.
29.09.2025 10:00

We also find that LLMs can serve as soft verifiers. Their judgments recover useful data and often match or surpass selection by formal unit tests.
29.09.2025 10:00

Relaxing verification thresholds boosts performance, but only with sufficiently complex test suites. Correctness still matters, but how we define it is the real issue.
29.09.2025 10:00

We find:
- Rigid verification risks biasing toward easy problems, while richer correctness signals preserve both quality and diversity.
What if the way we verify synthetic code is limiting model performance?
In our latest work we uncover the Verification Ceiling Problem: strict "all tests must pass" rules throw away useful data, while weak tests let errors through.
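A minimal sketch of that trade-off, assuming a simple pass-rate filter (the 0.8 threshold and the toy test outcomes are illustrative, not numbers from the paper):

```python
def keep_sample(test_results, threshold=1.0):
    """Decide whether to keep a synthetic code sample given its unit-test
    outcomes. threshold=1.0 is the strict 'all tests must pass' rule;
    relaxing it (e.g. to 0.8) recovers near-correct samples that strict
    filtering would throw away."""
    pass_rate = sum(test_results) / len(test_results)
    return pass_rate >= threshold

results = [True, True, True, True, False]     # sample passes 4 of 5 tests
strict = keep_sample(results, threshold=1.0)  # False: discarded
relaxed = keep_sample(results, threshold=0.8) # True: kept
```

Under the strict rule this near-correct sample is lost; under a relaxed threshold it survives, which only helps if the test suite is rich enough to catch genuinely wrong code.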
I'm excited to share that I'll be stepping into the role of Head of @cohereforai.bsky.social. It's an honor and a responsibility to lead such an extraordinary group of researchers pushing the boundaries of AI research.
05.09.2025 17:26

Papers In The Park 14. Last one of the season! Still great weather. Surprising. Anthony is presenting "Why Language Models Hallucinate".
Thanks to @cohereforai.bsky.social for the copies and pizza.
Rare opportunity: Cohere Labs is hiring a Research Scientist!
If you're passionate about studying fundamental AI problems and working in a globally collaborative, open-science environment, this is for you.
Apply here: jobs.ashbyhq.com/cohere/7ec9e...
It's papers in the park 7! Thanks to @cohereforai.bsky.social for the papers and the pizza, and to Alvin and Anthony for organizing.
It's easily one of the funnest paper reads in the city!
Breaking into AI research is harder than ever, and early-career researchers face fewer chances to get started.
Entry points matter.
We started the Scholars Program 3 years ago to give new researchers a real shot. Excited to open applications for year 4!
Check out the full blogpost here: https://cohere.com/blog/elo-ratings-beyond-arena-style-evaluations
Great to collaborate with Adithya Venkatadri Hulagadri, @mziizm.bsky.social, @jiangangngui.bsky.social, and @juliakreutzer.bsky.social on this exploration.
In this blogpost we propose a 3rd path:
- Balanced sampling across languages/tasks
- Offline pseudo-pairwise comparisons (Bradley-Terry)
- Confidence intervals & transparent breakdowns
The result? Rankings that better reflect real model utility.
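As a sketch, offline Bradley-Terry strengths can be fit from a pairwise win matrix with the classic minorisation-maximisation update (the win counts below are made up for illustration; the blogpost's actual pipeline and data differ):

```python
def bradley_terry(wins, iters=200):
    """Fit Bradley-Terry strengths p from wins[i][j] = number of times
    model i beat model j, using the standard MM iteration
    p_i <- W_i / sum_j (n_ij / (p_i + p_j))."""
    n = len(wins)
    p = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            total_wins = sum(wins[i])
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new.append(total_wins / denom if denom else p[i])
        s = sum(new)
        p = [x * n / s for x in new]  # normalise so strengths sum to n
    return p

# Toy example: model 0 beats model 1 in 8 of 10 head-to-head comparisons.
strengths = bradley_terry([[0, 8], [2, 0]])
```

Unlike online Elo, this fit is order-independent and can be recomputed with confidence intervals over resampled comparisons, which is what makes the offline approach attractive for leaderboards.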
From circular wins/losses across skills, to tie-handling pitfalls, to prompt-spamming in arenas, Elo struggles when competition isn't a single, binary game.
We show how multilingual, multi-task evaluation breaks its core assumptions.
While effective for chess, Elo ratings struggle with LLM evaluation due to volatility and transitivity issues.
New post in collaboration with AI Singapore explores why Elo falls short for AI leaderboards and how we can do better.
Still have questions about the Scholars Program? Join our information session on August 15th at 11am ET to get all the answers you need!
Register now - https://tinyurl.com/CohereLabsScholarsInfo
Accepted scholars will join our world-class research team from Jan to Aug 2026. This full-time, paid opportunity reflects the program's intensity and dedication, setting it apart from other labs.
Apply here: https://jobs.ashbyhq.com/cohere/a77c6864-5a43-44c1-81dc-a66e23bdd9a6
Scholars will gain access to a robust experimental framework, empowering them to contribute to our ongoing commitment to responsible, fundamental research in machine learning. This is your chance to make a real impact and change the course of ML research.
13.08.2025 13:32

The Scholars Program offers a unique, full-time opportunity to work alongside leading researchers in ML.
Our mission is to identify and nurture emerging talent from across the globe, driving innovative research that pushes the boundaries of AI.
cohere.com/research/scholars-program
Applications are now open for the next cohort of the Cohere Labs Scholars Program!
This is your chance to collaborate with some of the brightest minds in AI & chart new courses in ML research. Let's change the spaces where breakthroughs happen.
Apply by Aug 29.
"When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs"
Led by: Ammar Khairi, Daniel D'souza, Ye Shen, Julia Kreutzer, Sara Hooker
Paper link: arxiv.org/abs/2506.20544