Raphaël Avalos

@raphael.avalos.fr

Fine-tuning LLMs @Cohere | PhD Candidate on RL @VUB

515 Followers  |  288 Following  |  11 Posts  |  Joined: 22.11.2024

Latest posts by raphael.avalos.fr on Bluesky

Excited to share the technical report on Command R7B (7B) and Command A (111B), our flagship model! These models are the result of incredible teamwork at @cohere.com, and it was an honor to be part of it.

Report:
cohere.com/research/pap...

28.03.2025 14:46 — 👍 0    🔁 0    💬 0    📌 0
ALA 2025

🚨 Less than 48 hours left to submit to the 17th Adaptive and Learning Agents workshop at @AAMASconf! 🚨
We welcome full papers, work in progress, and 2-page abstracts of recent journal papers. Don't miss the deadline!
🔗 More details: ala-workshop.github.io

24.02.2025 16:45 — 👍 2    🔁 1    💬 0    📌 0

Don't miss the opportunity to submit your (Multi-Agent) RL work to the ALA workshop!

28.01.2025 18:10 — 👍 2    🔁 0    💬 0    📌 0

The Bluesky account and website for the next edition of the ALA workshop are live! Follow the account to get all the updates :)

09.01.2025 16:58 — 👍 0    🔁 0    💬 0    📌 0

I’m not sure about 1, but you could look into results on belief MDPs.
For 2, consider an environment with two rooms where the agent needs to press a different button in each room to get the optimal reward. If there's a cheap way to determine which room the agent is in, then checking the room first would be the optimal policy :)

25.11.2024 23:00 — 👍 1    🔁 0    💬 1    📌 0

Bsky’s strength lies in being open-source and federated. This enables anyone to host servers, set moderation policies, and create custom feeds, while removing the incentives to let bots survive. It’s a tough challenge, but there’s hope!

25.11.2024 22:47 — 👍 0    🔁 0    💬 0    📌 0

IMO, UCB favors exploration, not information-seeking, as it adds an exploration bonus rather than aiming to reduce state uncertainty. However, effective exploration can uncover policies where gathering information leads to better outcomes.
Hope that helps!

25.11.2024 21:25 — 👍 2    🔁 0    💬 1    📌 0
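
To make the point about the exploration bonus concrete, here is a minimal sketch of a UCB-style index for a multi-armed bandit. It assumes the classic UCB1 form (empirical mean plus sqrt(c * ln t / n)); the post does not say which UCB variant is meant, and the names `ucb1_index`, `select_arm`, and the constant `c` are hypothetical, chosen only for illustration. Note that the bonus depends only on pull counts, not on any belief over hidden state, which is why it drives exploration rather than information-seeking.

```python
import math


def ucb1_index(mean_reward, pulls, total_pulls, c=2.0):
    """Empirical mean plus a bonus that shrinks as the arm is pulled more often.

    Assumed UCB1 form; `c` is an illustrative exploration constant.
    """
    if pulls == 0:
        return math.inf  # an untried arm always gets picked first
    bonus = math.sqrt(c * math.log(total_pulls) / pulls)
    return mean_reward + bonus


def select_arm(means, counts):
    """Return the index of the arm with the highest UCB1 index."""
    total = sum(counts)
    scores = [ucb1_index(m, n, total) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


# Arm 1 has a lower empirical mean but far fewer pulls, so its bonus dominates.
chosen = select_arm(means=[0.6, 0.5], counts=[100, 2])
print(chosen)
```

With equal pull counts the bonuses cancel and the choice falls back to the higher empirical mean, so the bonus only matters while the counts are unbalanced.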

If you're an RL researcher or RL adjacent, pipe up to make sure I've added you here!
go.bsky.app/3WPHcHg

09.11.2024 16:42 — 👍 71    🔁 26    💬 52    📌 0

I am down!

25.11.2024 06:45 — 👍 0    🔁 0    💬 0    📌 0

I had lots of fun at the first edition: there were good talks and papers, and it was nice seeing old friends and making new ones! Highly recommend submitting and attending 😁

24.11.2024 19:14 — 👍 3    🔁 0    💬 0    📌 0

Just finished my first week at @cohere.com! Everyone has been so welcoming, and I’ve already learned so much. I’m really excited for the next weeks!

22.11.2024 23:17 — 👍 3    🔁 0    💬 0    📌 0

Hello, World! I'll mostly write about LLMs, RL, and random coding stuff. :)

22.11.2024 11:29 — 👍 5    🔁 0    💬 1    📌 0

Would love to be added!

22.11.2024 11:16 — 👍 0    🔁 0    💬 0    📌 0
