If you are not familiar with the robot-learning/RL papers, and if you have the bandwidth to go through them, I'm more than happy to share my source papers! Please let me know either way!
28.06.2025 14:01
@nagababa.bsky.social
Your friendly neighborhood roboticist! PhD student @cmurobotics.bsky.social Interested in Dexterous Manipulation, Democratization of Robots and Sensors, Sample Efficient RL, Soft Robotics, Causality, Multi-Agent Systems. servo97.github.io
And that some intervention is still needed for robot learning pipelines to demonstrate respectable ICL?
The 8th thread graph makes me very curious, and I'd love to hear your thoughts on this phenomenon!
i.e., converting robot joint trajectories into discrete cosine transform coefficients seems to significantly improve sample efficiency and generalizability.
Why do we see this happen? Is it merely an ephemeral local minimum that we're stuck in?
However, when SOTA papers use transformers to learn policies and Q-functions in RL, the observation seems to be that "distributional RL works better for offline RL tasks", or that frequency-space action tokenization helps.
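For readers who have not seen frequency-space action tokenization, here is a minimal sketch of the general idea: take a chunk of joint angles, apply a discrete cosine transform, and quantize the low-frequency coefficients into integer tokens. It is not the method of any specific paper. The function names (`dct2`, `idct2`, `tokenize`, `detokenize`), the `step` and `keep` parameters, and the toy trajectory are all assumptions made for illustration.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence (pure Python, O(n^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def tokenize(chunk, step=0.05, keep=8):
    """Hypothetical tokenizer: keep the first `keep` DCT coefficients
    and quantize each one to an integer with bin width `step`."""
    coeffs = dct2(chunk)
    return [round(c / step) for c in coeffs[:keep]]

def detokenize(tokens, length, step=0.05):
    """Undo the quantization, zero-pad the dropped coefficients, invert the DCT."""
    coeffs = [t * step for t in tokens] + [0.0] * (length - len(tokens))
    return idct2(coeffs)

# Toy joint-angle chunk (radians, 32 timesteps), built from two low-frequency
# cosines so that the round trip is easy to check. Real trajectories are not
# this clean and would lose some detail to truncation and quantization.
n = 32
chunk = [
    0.4 * math.cos(math.pi * (i + 0.5) * 1 / n)
    + 0.2 * math.cos(math.pi * (i + 0.5) * 3 / n)
    for i in range(n)
]

tokens = tokenize(chunk)               # 8 integers instead of 32 floats
recon = detokenize(tokens, len(chunk))
max_err = max(abs(a - b) for a, b in zip(chunk, recon))
```

The point of the sketch is that a smooth 32-step chunk collapses into a handful of discrete tokens, which is the kind of compact, discrete action representation a transformer policy can predict token by token.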
28.06.2025 14:01
Hi Daniel, great post!
I'm curious what you think about continuous control use cases. From my very quick read of the post (not the paper itself), it seems that ICL emerges as a property of the model being able to handle a diversity of continuous-variable regression tasks.
1/n
It was a dream come true to teach the course I wish existed at the start of my PhD. We built up the algorithmic foundations of modern-day RL, imitation learning, and RLHF, going deeper than the usual "grab bag of tricks". All 25 lectures + 150 pages of notes are now public!
20.06.2025 03:53
I like scholar-inbox
www.scholar-inbox.com
I have a draft of my introduction to cooperative multi-agent reinforcement learning on arxiv. Check it out and let me know any feedback you have. The plan is to polish and extend the material into a more comprehensive text with Frans Oliehoek.
arxiv.org/abs/2405.06161
Store it on an old school magnetic HDD!
You only gotta power it on every decade or so to maintain the hardware.
RL as a refinement tool has been used in dexterous manipulation for some time!
It used to be quite hard to do tabula rasa learning for dexterous manipulation. And still is, for the most part!
There are lots of people who've influenced AI but haven't won Nobel prizes.
I discuss a tiny sliver of them in this parody of @billyjoelofficial.bsky.social's "We didn't start the fire"...
Enjoy!
youtube.com/shorts/qDSYA...
Hard agree. Although how to reconcile that with when you're writing a paper?
Like what do you think is a reasonable process to pick "baselines"?
I've seen some students get jaded cos reviewers ask them to incl "baselines" with shitty git implementations :(
If you are into ML theory (RL or not) with a proven track record, and you are interested in an industry research position, PM me. Feel free to spread the word.
19.12.2024 00:55
Hi! Can you please add me to the list?
Thanks for making it!
A common question nowadays: Which is better, diffusion or flow matching?
Our answer: They're two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That's great: It means you can use them interchangeably.
30.11.2024 14:59
Anne Gagneux, Ségolène Martin, @quentinbertrand.bsky.social, Remi Emonet and I wrote a tutorial blog post on flow matching: dl.heeere.com/conditional-... with lots of illustrations and intuition!
We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423
Intro Post
Hello World!
I'm a 2nd-year Robotics PhD student at CMU, working on distributed dexterous manipulation, accessible soft robots and sensors, sample-efficient robot learning, and causal inference.
Here are my cute robots:
PS: Videos are old and sped up. They move slower in the real world :3