Antonin Raffin

@araffin.bsky.social

Researcher in robotics and machine learning (Reinforcement Learning). Maintainer of Stable-Baselines3 (SB3). https://araffin.github.io/

3,283 Followers  |  244 Following  |  95 Posts  |  Joined: 08.02.2024

Latest posts by araffin.bsky.social on Bluesky

2026-04-27-compositionality by EricElmoznino · Pull Request #16 · iclr-blogposts/2026
OpenReview Submission Thread Checklist before opening a PR [x] I am opening a pull request against the main branch of the 2026 repo. [x] My post and all associated references to it are all l...

Submission is open, but the submission system doesn't seem to work yet... (no public URL generated for github.com/iclr-blogpos... for instance :/)

Minor: in the OpenReview submission, there is an additional dot in the example URL `https://d2jud02ci9yv69.cloudfront.net./[YOUR_SUBMISSION]/`

06.11.2025 08:46 - 👍 0    🔁 0    💬 0    📌 0
Ronny Chieng Meets Neo, the World's Stupidest Robot Maid | The Daily Show
YouTube video by The Daily Show

Wow. The backlash to the 1X Neo announcement has been widespread and *merciless*.

This may be a warning to lots of humanoid companies. All your promises don't matter to the public if your robot looks or acts dumb.

youtu.be/b_SNExtznd4?...

31.10.2025 12:34 - 👍 15    🔁 1    💬 2    📌 2

Why self-taught engineers often outperform

michaelbastos.com/blog/why-sel...

#programming #softwaredevelopment #tech #blog

29.10.2025 19:34 - 👍 8    🔁 4    💬 1    📌 0

🚨The Formalism-Implementation Gap in RL research🚨

Lots of progress in RL research over the last 10 years, but too performance-driven => overfitting to benchmarks (like the ALE).

1⃣ Let's advance science of RL
2⃣ Let's be explicit about how benchmarks map to formalism

1/X

28.10.2025 13:55 - 👍 43    🔁 5    💬 1    📌 2
Pixi: Modern package management for Robotics
Developing Robots is hard; Pixi makes it easier by creating reproducible, cross-platform ROS development environments without Docker or Ubuntu lock-in.

🚨 New blog post alert!

Modern package management for Robotics with Pixi!

prefix.dev/blog/reprod...

#ROS #ROSCon #ROSCon2025

24.10.2025 15:34 - 👍 4    🔁 3    💬 0    📌 0
Three panel thing. In the left panel we use error bars. In the second, we take statistical significance as the biggest number but still have error bars. In LLM science, we just have the biggest number

What if we did a single run and declared victory

23.10.2025 02:28 - 👍 339    🔁 70    💬 13    📌 9
a spurious correlation example

A wonderful collection of spurious correlations: correlation is not causation.

link: www.tylervigen.com/spurious-cor...

found via @stefanjudis.com newsletter

21.10.2025 05:44 - 👍 6    🔁 0    💬 0    📌 0

A good video on software refactoring and redesign (about the Audacity audio editing program)

20.10.2025 10:20 - 👍 4    🔁 0    💬 0    📌 0
CORL 2025
YouTube video by Conference on Robot Learning

Video recordings of the CoRL 2025 talks are now available! Many interesting orals / keynotes / sponsor talks / early-career talks / poster spotlights.
Day 1: www.youtube.com/watch?v=Use5...
Day 2: www.youtube.com/watch?v=rh2o...
Day 3: www.youtube.com/watch?v=9lzF...

17.10.2025 05:31 - 👍 7    🔁 1    💬 1    📌 0
Cross compiling in the Conda ecosystem
Cross compiling is a fundamental capability in modern software development, allowing developers to build packages for different architectures without needing access to the target hardware.

In our little deep-dive series, we're now exploring how cross-compilation in the Conda ecosystem works: prefix.dev/blog/cross-c.... Back in the day, @conda-forge.org rolled this out widely to support osx-arm64 early on, and now for linux-aarch64/ppc64le.

15.10.2025 06:11 - 👍 5    🔁 6    💬 0    📌 0
How to catch subtle RL bugs before they catch you
Tools and habits for reliable, fast RL experimentation and development

Rapid RL experimentation is great. But how do you catch silent errors before they slip by?

In this post, I share tools and habits that help me move quickly from idea to result without sacrificing reliability.

13.10.2025 11:29 - 👍 41    🔁 5    💬 0    📌 1
But what is a Laplace Transform?
YouTube video by 3Blue1Brown

Ever since I made a video about Fourier Transforms, one of the most requested topics on the channel has been its close cousin, the Laplace Transform.

I've been having a lot of fun animating a mini-series about this topic, and the main part is now out.

youtu.be/j0wJBEZdwLs

12.10.2025 12:49 - 👍 414    🔁 66    💬 11    📌 5
The Big LLM Architecture Comparison
YouTube video by Sebastian Raschka

Updated & turned my Big LLM Architecture Comparison article into a video lecture.

The 11 LLM archs covered in this video:
1. DeepSeek V3/R1
2. OLMo 2
3. Gemma 3
4. Mistral Small 3.1
5. Llama 4
6. Qwen3
7. SmolLM3
8. Kimi 2
9. GPT-OSS
10. Grok 2.5
11. GLM-4.5/4.6

www.youtube.com/watch?v=rNlU...

10.10.2025 17:05 - 👍 52    🔁 9    💬 0    📌 1
GitHub - mujocolab/mjlab: Isaac Lab API, powered by MuJoCo-Warp, for RL and robotics research.

Mjlab

Isaac Lab API, powered by MuJoCo-Warp, for RL and robotics research.

github.com/mujocolab/mj...

10.10.2025 09:35 - 👍 8    🔁 2    💬 1    📌 0
GitHub - araffin/sbx: SBX: Stable Baselines Jax (SB3 + Jax) RL algorithms

SBX (SB3 Jax) v0.23.0 is out =)!

I added CNN support for PPO.
It turns out that using a shared features extractor (CNN in this case) is important for achieving good performance on Atari games.

Perf report: wandb.ai/openrlbenchm...

github.com/araffin/sbx

29.09.2025 17:23 - 👍 7    🔁 1    💬 0    📌 0
A robot arm supports a humanoid robot on a treadmill

Training a small humanoid robot with reinforcement learning using another robot for reset.

by Kaizhe Hu et al. (ToddlerBot Stanford)

Project page: robot-trains-robot.github.io

29.09.2025 08:48 - 👍 6    🔁 1    💬 0    📌 0

Open-Source Hardware in the Era of Robot Learning Workshop @ CoRL 2025

Website: open-hardware-robots.github.io/CoRL2025/

27.09.2025 06:19 - 👍 15    🔁 2    💬 0    📌 0

The CoRL 2025 workshop on Open-Source Hardware in the Era of Robot Learning is starting now! You can join the conversation online via live streaming: https://www.youtube.com/live/ZVPIJzF1df4

27.09.2025 00:32 - 👍 1    🔁 1    💬 0    📌 1

📣 Call for Blog Posts at #ICLR2026 @iclr_conf

Following the success of the past iterations, we are opening the Call for Blog Posts 2026!

iclr-blogposts.github.io/2026/about/#...

Please retweet!

22.09.2025 07:44 - 👍 14    🔁 8    💬 1    📌 1

A practical introduction to (deep) RL, providing intuitions to understand the more recent algorithms.

The plan is to start from tabular Q-learning and work our way up to Deep Q-learning (DQN). In a following post, I will continue on to Soft Actor-Critic (SAC) and its extensions.

22.09.2025 08:06 - 👍 18    🔁 0    💬 0    📌 0
The Open Duck Mini open-source and open-hardware robot.

Next Saturday, Antoine Pirrone will present Pollen Robotics & Hugging Face's open-source robots, including Reachy Mini, the SO-100 arm, the Amazing Hand and the Open Duck Mini. He will discuss the sim2real challenges of making the Open Duck Mini walk, and how […]

[Original post on fosstodon.org]

21.09.2025 12:23 - 👍 7    🔁 1    💬 0    📌 0
GitHub - araffin/rlss23-dqn-tutorial: Deep Q-Network (DQN) and Fitted Q-Iteration (FQI) tutorial for RL Summer School 2023

Code and colab notebooks: github.com/araffin/rlss...

18.09.2025 15:09 - 👍 4    🔁 0    💬 0    📌 0
RL102: From Tabular Q-Learning to Deep Q-Learning (DQN) | Antonin Raffin | Homepage
This blog post is meant to be a practical introduction to (deep) reinforcement learning, presenting the main concepts and providing intuitions to understand the more recent Deep RL algorithms. For a ...

RL102: From Tabular Q-Learning to Deep Q-Learning (DQN) - A Practical Introduction to (Deep) Reinforcement Learning

araffin.github.io/post/rl102/

18.09.2025 15:09 - 👍 13    🔁 2    💬 1    📌 1
Build C++ projects with Pixi
Painless dependency management (including shared libraries), monorepos and CI/CD is here for C++/CMake projects with Pixi.

Package building with Pixi is being rolled out! Dive into our latest blog post on crafting C++ packages.

And guess what? It's not just for C++; Pixi plays nice with Python, Rust, ROS, Mojo, and beyond!

prefix.dev/blog/pixi-b...

05.09.2025 10:00 - 👍 15    🔁 3    💬 1    📌 0
A comic about computing. A transcript may be available at the link in the post.

bash tricks

permalink: wizardzines.com/comics/bash-...
from our zine "Bite Size Command Line": wizardzines.com/zines/bite-s...

03.09.2025 19:24 - 👍 19    🔁 5    💬 0    📌 0

This is absolutely true -- this is a superb and much-needed consolidation of so much of modern RL. Kevin, inquiring minds want to understand the process you use to put this artwork together! @sirbayes.bsky.social Perhaps this is also the ultimate benchmark for Gemini Deep Research reports. ;-p

03.09.2025 04:41 - 👍 10    🔁 1    💬 0    📌 0
Let's write a search engine, part 1 of 2

Weekend project: building a (site) search engine www.redblobgames.com/blog/2025-08... just for fun! :)

01.09.2025 02:48 - 👍 28    🔁 4    💬 0    📌 1

NeurIPS has decided to do what ICLR did: As a SAC I received the message 👇 This is wrong! If the review process cannot handle so many papers, the conference needs to split instead of arbitrarily rejecting 400 papers.

28.08.2025 16:12 - 👍 106    🔁 17    💬 8    📌 2

Where do some of Reinforcement Learning's great thinkers stand today?

Find out! Keynotes of the RL Conference are online:
www.youtube.com/playlist?lis...

Wanting vs liking, Agent factories, Theoretical limit of LLMs, Pluralist value, RL teachers, Knowledge flywheels
(guess who talked about which!)

27.08.2025 12:46 - 👍 76    🔁 23    💬 1    📌 1
How astronauts control robots from space
YouTube video by European Space Agency, ESA

How astronauts control robots from space

(featuring our quadruped Bert 👀)

youtu.be/BMFPVCu16SQ

19.08.2025 21:33 - 👍 2    🔁 0    💬 0    📌 0