
Amir-massoud Farahmand

@sologen.bsky.social

Research Goal: Understanding the computational and statistical principles required to design AI/RL agents. Associate Professor at Polytechnique Montréal and Mila. 🇨🇦 academic.sologen.net

690 Followers  |  216 Following  |  181 Posts  |  Joined: 21.11.2024

Posts by Amir-massoud Farahmand (@sologen.bsky.social)

In light of the ongoing conflict in the Middle East, RLC decided to remove the abstract deadline: rl-conference.cc/callforpaper...

The only deadline is for the full paper: Mar 5 (AOE) openreview.net/group?id=rl-...

Affected folks may also contact the PCs to discuss deadline extensions before Mar 5.

03.03.2026 02:12 | 👍 14  🔁 8  💬 1  📌 1

Ali Khamenei is in hell. The world is a better place now!

01.03.2026 02:07 | 👍 2  🔁 0  💬 0  📌 0

RLC 2026 Call for Workshop is live on OpenReview!

Submission deadline: Mar 12 (AoE).
Full details here: rl-conference.cc/call_for_wor...

@glenberseth.bsky.social @eugenevinitsky.bsky.social @twkillian.bsky.social @schaul.bsky.social @sologen.bsky.social @audurand.bsky.social @bradknox.bsky.social

26.02.2026 21:17 | 👍 10  🔁 3  💬 0  📌 1

Submit your RL papers to RLC!
This is now perhaps the best venue for RL researchers.

26.02.2026 00:04 | 👍 12  🔁 3  💬 1  📌 0

I am rerunning my class on robot learning this year, and I plan to push many code examples to help others get to the ugly details fast. One of these details is how BC gets off track as network sizes change. Blog and notebook below.

22.02.2026 15:12 | 👍 14  🔁 1  💬 1  📌 0

It is indeed disheartening. It has happened to me many times (and to many others too). After some point, you stop worrying about them too much. I realize this is not good advice for a budding researcher.

To answer your question: A major reason is that those papers come from famous labs.

15.02.2026 03:57 | 👍 0  🔁 0  💬 0  📌 0

It's OK to tell the authors about it.

14.02.2026 22:50 | 👍 0  🔁 0  💬 1  📌 0

🚀 Excited to share REPPO, a new on-policy RL agent!

TL;DR: Replace PPO with REPPO for fewer hyperparameter headaches and more robust training.

REPPO, led by @cvoelcker.bsky.social, will be presented at ICLR 2026. How does it work? 🧵👇

13.02.2026 19:28 | 👍 25  🔁 10  💬 1  📌 0

The compliment of the day: "What’s unusual is your willingness to follow the logic all the way through instead of stopping where it becomes socially awkward."

06.02.2026 00:13 | 👍 1  🔁 0  💬 0  📌 0
Introduction to Reinforcement Learning A course on reinforcement learning.

Thank you! I hope you like it.
I may add one or two chapters to it in the future.
This is the course based on it: amfarahmand.github.io/IntroRL/

02.02.2026 17:46 | 👍 1  🔁 0  💬 0  📌 0

You may want to take a look at my book, especially if you are interested in a more rigorous, yet introductory, exposition of Reinforcement Learning.

amfarahmand.github.io/IntroRL/lect...

02.02.2026 01:45 | 👍 4  🔁 0  💬 1  📌 0

Has taken a long time to polish, but slowly becoming very proud of rlhfbook.com and do think it's a great resource for many people. A lot of hours (and tokens and reader feedback) going into making it right.

25.01.2026 19:00 | 👍 29  🔁 2  💬 3  📌 0

I know about this video. I couldn't watch it. This is just too much cruelty and heartache.

23.01.2026 18:21 | 👍 2  🔁 0  💬 1  📌 0

Their silence is deafening.

13.01.2026 18:19 | 👍 3  🔁 0  💬 0  📌 0

Yes, this is along the same discussions we had before.

05.01.2026 21:52 | 👍 0  🔁 0  💬 0  📌 0

P.S: I may write more about this later. These are just some key points, so that I won't forget.

05.01.2026 19:07 | 👍 1  🔁 0  💬 0  📌 0

One may claim that robotics is not afflicted by this problem. That is only partially true. In robotics, the real world is as rich as it gets, but its complexity and richness are mostly cordoned off by the well-defined set of tasks that the robot has to perform.

05.01.2026 19:07 | 👍 1  🔁 0  💬 1  📌 0

Without that richness, the agent reaches the ceiling of its abilities quite fast and we, as researchers, cannot properly study the capabilities and limitations of our ideas and algorithms.

05.01.2026 19:07 | 👍 1  🔁 0  💬 1  📌 0

A child and her caregiver can instantly create a novel task that requires new perceptual abilities, decision-making capabilities, and motor skills. We don't have such flexibility in our environments.

05.01.2026 19:07 | 👍 3  🔁 0  💬 1  📌 0

A significant hurdle in empirical RL and broader AI research is caused by the limitations of the environments in which our agents learn and build their "artificial minds". This should be compared with the richness of the real world in which a human child flourishes.

05.01.2026 19:07 | 👍 3  🔁 1  💬 2  📌 0
ALT: a cartoon of a woman says well 59 it's a high f

Grading ...

23.12.2025 23:53 | 👍 4  🔁 0  💬 0  📌 0
RLJ | RLC Call for Papers

Hi RL Enthusiasts!

RLC is coming to Montreal, Quebec, in the summer: Aug 16–19, 2026!

Call for Papers is up now:
Abstract: Mar 1 (AOE)
Submission: Mar 5 (AOE)

Excited to see what you’ve been up to - Submit your best work!
rl-conference.cc/callforpaper...

Please share widely!

23.12.2025 22:16 | 👍 61  🔁 28  💬 0  📌 8

I couldn't help but notice how incredibly smart Witten sounds to me (and he definitely is, given his accomplishments).
OFC, this is partly caused by my huge knowledge gap in his field. But I think there is something beyond the knowledge gap that makes me feel that way.

22.12.2025 22:35 | 👍 3  🔁 0  💬 0  📌 0
String Theory in 2037 | Brian Greene & Edward Witten
YouTube video by World Science Festival

I came across this interview of Edward Witten (@edwardwitten.bsky.social) by Brian Greene. I don't usually watch long YT videos, but this one is quite curious and "electrifying".
I recommend it if you're interested in the state of String Theory.
www.youtube.com/watch?v=sAbP...

22.12.2025 22:35 | 👍 5  🔁 0  💬 1  📌 0

Ah ... you have to be an American to become rich in Canada!

18.12.2025 19:29 | 👍 4  🔁 1  💬 0  📌 0

Yes, I will post the content (selected papers, etc.) online. But the class discussions won't be online or recorded.

18.12.2025 18:06 | 👍 1  🔁 0  💬 0  📌 0

Thank you!
GAIL and DAGGER were on my radar, and both are good papers (I am a DAGGER fan!). I'd read your DQfD paper years ago. I like the idea of combining both expert data and RL. This is something we developed back in 2013 (you cited it as Kim et al. 2013).
www.sologen.net/papers/APID%...

16.12.2025 03:55 | 👍 1  🔁 0  💬 0  📌 0

Thank you!
Bagnell's paper is a nice intro.
I just found this "Is Behaviour Cloning All You Need" earlier today. Looks very interesting.

16.12.2025 03:48 | 👍 0  🔁 0  💬 1  📌 0

Thank you! GAIL was under my radar, and is a good paper.
Thanks for bringing the DICE family to my attention. That's exactly what I wanted to find here.
(Haven't read the Diffusion Policies paper closely.)

16.12.2025 01:33 | 👍 0  🔁 0  💬 1  📌 0

I am asking for a course I am designing. I have some papers in mind, but I want to make sure I am not missing a good paper.

16.12.2025 00:33 | 👍 1  🔁 0  💬 1  📌 0