Check out our new work on learning to drive in new cities without collecting any new demonstrations in that city. All we need is a simple reward function and readily available map and meta-information about the city. Self-play reinforcement learning does the rest!
21.02.2026 08:19
I was interviewed for the Robot Talk Podcast.
06.12.2025 09:18
Check out our new work on using low-rank perturbations to make evolution strategies work for billion-parameter models.
22.11.2025 09:01
The robotics revolution is here.
05.11.2025 14:14
Waymo is coming to London next year.
15.10.2025 09:57
A photo of Shimon wearing a blue t-shirt.
Got questions about self-driving cars?
Later this season, I'll be chatting to Prof. Shimon Whiteson (@shimon8282.bsky.social) from @ox.ac.uk and @waymo.bsky.social about machine learning for autonomous vehicles.
Send me your questions for Shimon in the comments below! #Robots #Robotics #AI
22.09.2025 14:28
What happened?!?
28.07.2025 10:26
Rutger Bregman Wants to Save Elites From Their Wasted Lives
"In the fight against injustice, winning is a moral duty." www.nytimes.com/2025/05/17/m...
19.05.2025 20:41
now publishers - A Tutorial on Meta-Reinforcement Learning
Our survey on meta-reinforcement learning has now been published by Foundations and Trends in Machine Learning: nowpublishers.com/article/Deta...
18.04.2025 15:19
Just another day at the office.
16.04.2025 19:29
In order to set a good example for my students, I feel morally obliged to do as little work as possible.
13.01.2025 14:42
I continue to maintain that that is an insane amount of butter. I use 50g and it is already plenty rich.
11.01.2025 22:06
Apparently I have not been doing my job?
23.12.2024 08:51
Thanks to Frans Oliehoek and Chris Amato for pointing out these issues. Thanks also to Frans Oliehoek and Andrea Baisero for feedback on the revised proof.
18.12.2024 11:50
This version corrects an error in the original convergence proof arising from the fact that the critic depended on the state but not the joint history.
18.12.2024 11:50
The position is London-based, but the students can come from anywhere.
15.12.2024 11:27
What if instead of viewing AI as a separate entity, we treat the human and computer as one cognitive unit that together outperforms either on its own? My prefrontal cortex doesn't fret about delegating visual processing to my visual cortex. I don't fret about delegating navigation to Google Maps.
13.12.2024 07:25
Too late, you're cancelled.
09.12.2024 21:56
Who says Mountain Car is not a real world problem?
08.12.2024 22:11
pbs.twimg.com/media/GGtsDH...
12.11.2024 16:24