
Bryan Chan

@chanpyb.bsky.social

PhD student at University of Alberta. Interested in reinforcement learning, imitation learning, machine learning theory, and robotics https://chanb.github.io/

1,095 Followers  |  131 Following  |  11 Posts  |  Joined: 10.11.2024

Latest posts by chanpyb.bsky.social on Bluesky

If you were at the last RLC NeurIPS event, you know it's not to be missed. Invite any RL researcher you know, no invite-only parties here.

10.12.2024 21:56 — 👍 18    🔁 2    💬 0    📌 0

Robotics manipulation to be specific :)

02.12.2024 15:29 — 👍 3    🔁 0    💬 0    📌 0

Can someone point me to any paper that uses RL on real-life (image-based) environments without sim2real/imitation learning? For good reasons, I am told that this is pretty common, but I’ve only found a handful of papers (CQN, QT-Opt, SAC-X)

02.12.2024 15:26 — 👍 3    🔁 0    💬 1    📌 0

Hi Csaba :)

30.11.2024 04:30 — 👍 1    🔁 0    💬 0    📌 0
Preview
Streaming Deep Reinforcement Learning Finally Works: Natural intelligence processes experience as a continuous stream, sensing, acting, and learning moment-by-moment in real time. Streaming learning, the modus operandi of classic reinforcement learning ...

Streaming Deep Reinforcement Learning Finally Works, by M. Elsayed, G. Vasan, and A. R. Mahmood, is one of those papers I wish I had written 😅

This paper seems to let us do RL with NNs the way it should always have been done. Everyone should read it!

arxiv.org/abs/2410.14606

27.11.2024 23:09 — 👍 92    🔁 20    💬 2    📌 0

I also think many robot learning papers are overclaiming what they can do… the paper task setup can be very easy in comparison to real production systems even for pick-n-place… but it’s hard to see this difference in the presented videos (if any)

27.11.2024 14:27 — 👍 3    🔁 0    💬 1    📌 0

I feel like many works do have experiments in sim, but they don’t seem to transfer to real life (not in terms of sim2real, but applying the same algorithm). I wonder how much of it comes from delays or us being overprotective of the robot in real life. Maybe evals need to include these components.

27.11.2024 14:23 — 👍 2    🔁 0    💬 2    📌 0

I thought that was just me 😅 was trying it on an uncluttered single-item picking task

27.11.2024 13:23 — 👍 6    🔁 0    💬 1    📌 0

Perhaps it’s “necessary” to have it as a baseline (arguably TD3 is just as important imo, but SAC seems to be more commonly used), and it’s hard to convince people that a new method is stronger than SAC. I think there are a few recent ones, e.g. ACE at NeurIPS. Generally feels like a popularity game to me

23.11.2024 17:29 — 👍 1    🔁 0    💬 0    📌 0

Let me try, we’ve been very quiet historically 😂

12.11.2024 14:58 — 👍 0    🔁 0    💬 0    📌 0

Can I get added please :)

12.11.2024 14:16 — 👍 0    🔁 0    💬 1    📌 0

Please see me :)

12.11.2024 14:15 — 👍 0    🔁 0    💬 0    📌 0

Please add me, I’ve been doing robot learning with RL/IL!

12.11.2024 14:10 — 👍 1    🔁 0    💬 1    📌 0
