Excited to share the technical report on Command R7B (7B) and Command A (111B), our flagship model! These models are the result of incredible teamwork at @cohere.com, and it was an honor to be part of it.
Report:
cohere.com/research/pap...
28.03.2025 14:46 — 👍 0 🔁 0 💬 0 📌 0
ALA 2025
🚨 Less than 48 hours left to submit to the 17th Adaptive and Learning Agents (ALA) workshop at @AAMASconf! 🚨
We welcome full papers, work in progress, and 2-page abstracts of recent journal papers. Don't miss the deadline!
🔗 More details: ala-workshop.github.io
24.02.2025 16:45 — 👍 2 🔁 1 💬 0 📌 0
Don't miss the opportunity to submit your (Multi-Agent) RL work to the ALA workshop!
28.01.2025 18:10 — 👍 2 🔁 0 💬 0 📌 0
The Bluesky account and website for the next edition of the ALA workshop are live! Follow the account to get all the updates :)
09.01.2025 16:58 — 👍 0 🔁 0 💬 0 📌 0
I’m not sure about 1, but you could look into results on belief MDPs.
For 2, consider an environment with two rooms where the agent needs to press a different button in each room to get the optimal reward. If there's a cheap way to determine which room the agent is in, then checking first and pressing the matching button would be the optimal policy :)
25.11.2024 23:00 — 👍 1 🔁 0 💬 1 📌 0
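A rough sketch of the two-room example, to make the trade-off concrete. All numbers and names here are assumptions for illustration, not from the post: a correct button press pays +1, a wrong press pays -1, and a hypothetical "look" action reveals the room at a fixed cost.

```python
def value_guess(p_room_a, r_correct=1.0, r_wrong=-1.0):
    """Best expected reward when pressing a button without checking the room."""
    press_a = p_room_a * r_correct + (1 - p_room_a) * r_wrong
    press_b = (1 - p_room_a) * r_correct + p_room_a * r_wrong
    return max(press_a, press_b)


def value_look_then_press(look_cost, r_correct=1.0):
    """Expected reward when paying look_cost to learn the room, then pressing correctly."""
    return r_correct - look_cost


def best_plan(p_room_a, look_cost):
    """Return the better plan ("look" or "guess") and its expected reward."""
    guess = value_guess(p_room_a)
    look = value_look_then_press(look_cost)
    return ("look", look) if look > guess else ("guess", guess)
```

Under these assumed rewards, a uniform belief over rooms with a look cost of 0.1 favors looking first, while a belief of 0.95 on one room with a look cost of 0.2 favors guessing directly. Gathering information is optimal only when it is cheap relative to the uncertainty it removes.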
Bsky’s strength lies in being open-source and federated. This lets anyone host servers, set moderation policies, and create custom feeds, while removing the incentives to let bots survive. It’s a tough challenge, but there’s hope!
25.11.2024 22:47 — 👍 0 🔁 0 💬 0 📌 0
IMO, UCB favors exploration, not information-seeking, as it adds an exploration bonus rather than aiming to reduce state uncertainty. However, effective exploration can uncover policies where gathering information leads to better outcomes.
Hope that helps!
25.11.2024 21:25 — 👍 2 🔁 0 💬 1 📌 0
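For readers unfamiliar with the exploration bonus mentioned in the post above, here is a minimal sketch of UCB1 on a Bernoulli bandit. The function names, the constant c=2.0, and the arm probabilities in the usage note are assumptions for illustration. Note that the bonus depends only on how often each arm was pulled, not on any belief over a hidden state.

```python
import math
import random


def ucb1_select(counts, values, t, c=2.0):
    """Pick the arm with the highest mean estimate plus exploration bonus."""
    # Pull every arm once first, since the bonus is undefined for zero pulls.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        values[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_ucb1(probs, steps=2000, seed=0):
    """Run UCB1 on Bernoulli arms with success probabilities `probs`."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)
    for t in range(1, steps + 1):
        arm = ucb1_select(counts, values, t)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values
```

Calling `run_ucb1([0.2, 0.8])` returns the pull counts and mean estimates per arm. The rarely pulled arm keeps a large bonus, so it still gets tried occasionally, which is exploration over actions rather than a deliberate attempt to reduce state uncertainty.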
If you're an RL researcher or RL adjacent, pipe up to make sure I've added you here!
go.bsky.app/3WPHcHg
09.11.2024 16:42 — 👍 71 🔁 26 💬 52 📌 0
I am down!
25.11.2024 06:45 — 👍 0 🔁 0 💬 0 📌 0
I had lots of fun at the first edition, there were good talks and papers, and it was nice seeing old friends and making new ones! Highly recommend submitting and attending 😁
24.11.2024 19:14 — 👍 3 🔁 0 💬 0 📌 0
Just finished my first week at @cohere.com! Everyone has been so welcoming, and I’ve already learned so much. I’m really excited for the coming weeks!
22.11.2024 23:17 — 👍 3 🔁 0 💬 0 📌 0
Hello, World! I'll mostly write about LLMs, RL, and random coding stuff. :)
22.11.2024 11:29 — 👍 5 🔁 0 💬 1 📌 0
Would love to be added!
22.11.2024 11:16 — 👍 0 🔁 0 💬 0 📌 0
Building robust LLMs @Cohere
Breakthrough AI to solve the world's biggest problems.
› Join us: http://allenai.org/careers
› Get our newsletter: https://share.hsforms.com/1uJkWs5aDRHWhiky3aHooIg3ioxm
"Le Monde" is a French newspaper founded by Hubert Beuve-Méry in 1944. It is also a Bluesky account that automatically posts news around the clock.
AI Researcher @ NNAISENSE. (Co)developed Highway Networks, Upside-Down RL, Bayesian Flow Networks, EvoTorch
📜 Learning is compression
https://rupeshks.cc/
AI. RL. Robots+Humans. Building general purpose agents.
Research Scientist in the Gaming and Interactive Agents Group at Sony AI.
Prev: MSFT Research, UT Austin, CMU, IIT Jodhpur.
https://scholar.google.com/citations?user=zZhWSQ0AAAAJ&hl=en
Researcher @ Google DeepMind and Honorary Fellow @ U of Edinburgh.
RL, philosophy, foundations, AI.
https://david-abel.github.io
CS assistant prof @Utah. Researches human-robot interaction, human-in-the-loop ML, AI safety and alignment. https://users.cs.utah.edu/~dsbrown/
Large-Scale Robot Decision Making @GoogleDeepMind European @ELLISforEurope - imitation interaction transfer - priors: @oxfordrobots @berkeley_ai @ETH @MIT
AI and robotics researcher at Technion
avivt.github.io/avivt/
Research Scientist @ Google DeepMind, in open-ended learning, and AI for Scientific Discovery.
Machine Teacher. Research Scientist at Phaidra. PhD from TU Delft. Previously JP Morgan, Huawei, Unity.
https://www.suau.io/
AI & Transportation | MIT Associate Professor
Interests: AI for good, sociotechnical systems, machine learning, optimization, reinforcement learning, public policy, gov tech, open science.
Science is messy and beautiful.
http://www.wucathy.com
Ph.D. Student studying AI & decision making at Mila / McGill University. Currently at FAIR @ Meta. Previously Google DeepMind & Google Brain.
https://brosa.ca
Assistant Professor at the University of Alberta. Amii Fellow, Canada CIFAR AI chair. Machine learning researcher. All things reinforcement learning.
📍 Edmonton, Canada 🇨🇦
🔗 https://webdocs.cs.ualberta.ca/~machado/
🗓️ Joined November, 2024
Faculty at the University of Pennsylvania. Lifelong machine learning for robotics and precision medicine: continual learning, transfer & multi-task learning, deep RL, multimodal ML, and human-AI collaboration. seas.upenn.edu/~eeaton
Assistant Professor @ Princeton ECE
Safe Human-Centered Robotics and AI
⛷️ ML Theorist carving equations and mountain trails | 🚴‍♂️ Biker, Climber, Adventurer | 🧠 Reinforcement Learning: Always seeking higher peaks, steeper walls and better policies.
https://ualberta.ca/~szepesva
Associate professor @ Université Laval - IID - Mila
Interested in reinforcement learning, bandits, partial monitoring, active learning, ... anything that learns by getting its own data from the environment!
Research @ OpenAI, Prev PhD at Oxford University