Tom Schaul

@schaul.bsky.social

RL researcher at DeepMind https://schaul.site44.com/ 🇱🇺

3,218 Followers  |  293 Following  |  28 Posts  |  Joined: 13.11.2024  |  2.4021

Latest posts by schaul.bsky.social on Bluesky

@rl-conference.bsky.social

27.08.2025 12:48 | 👍 1    🔁 0    💬 0    📌 0

Where do some of Reinforcement Learning's great thinkers stand today?

Find out! Keynotes of the RL Conference are online:
www.youtube.com/playlist?lis...

Wanting vs liking, Agent factories, Theoretical limit of LLMs, Pluralist value, RL teachers, Knowledge flywheels
(guess who talked about which!)

27.08.2025 12:46 | 👍 75    🔁 23    💬 1    📌 1

On my way to #ICML2025 to present our algorithm that strongly scales with inference compute, in both performance and sample diversity! 🚀

Reach out if you’d like to chat more!

13.07.2025 12:26 | 👍 8    🔁 2    💬 0    📌 0

Deadline to apply is this Wednesday!

02.06.2025 09:40 | 👍 4    🔁 1    💬 0    📌 0

The RL team is a small team led by David Silver. We build RL algorithms and solve ambitious research challenges. As one of DeepMind's oldest teams, it has been instrumental in building DQN, AlphaGo, Rainbow, AlphaZero, MuZero, AlphaStar, AlphaProof, Gemini, etc. Help us build the next big thing!

24.05.2025 10:08 | 👍 2    🔁 0    💬 1    📌 0
Research Engineer, Reinforcement Learning London, UK

Ever thought of joining DeepMind's RL team? We're recruiting for a research engineering role in London:
job-boards.greenhouse.io/deepmind/job...
Please spread the word!

22.05.2025 15:11 | 👍 28    🔁 8    💬 1    📌 1

When faced with a challenge (like debugging) it helps to think back to examples of how you've overcome challenges in the past. Same for LLMs!

The method we introduce in this paper is efficient because examples are chosen for their complementarity, leading to much steeper inference-time scaling! 🧪

20.03.2025 10:23 | 👍 18    🔁 5    💬 0    📌 0
RLC Keynote speakers: Leslie Kaelbling, Peter Dayan, Rich Sutton, Dale Schuurmans, Joelle Pineau, Michael Littman

Some extra motivation for those of you in RLC deadline mode: our line-up of keynote speakers -- as all accepted papers get a talk, they may attend yours!

@rl-conference.bsky.social

24.02.2025 11:16 | 👍 36    🔁 10    💬 0    📌 1

200 great visualisations: 200 facets and nuances of 1 planetary story.

31.01.2025 13:41 | 👍 5    🔁 0    💬 0    📌 0

The sound of two users joining per second: "tik", "tok"...

30.01.2025 11:39 | 👍 4    🔁 0    💬 0    📌 0
David Silver - Towards Superhuman Intelligence - RLC 2024
YouTube video by Reinforcement Learning Conference

Reposting David Silver's talk about how RL is the way to intelligence. No particular reason
www.youtube.com/watch?v=pkpJ...

27.01.2025 23:32 | 👍 71    🔁 7    💬 0    📌 0
Announcement of Richard S. Sutton as RLC 2025 keynote speaker

Excited to announce the first RLC 2025 keynote speaker, a researcher who needs little introduction, whose textbook we've all read, and who keeps pushing the frontier on RL with human-level sample efficiency

08.01.2025 15:03 | 👍 51    🔁 4    💬 0    📌 0

Could language games (and playing many of them) be the renewable energy that Ilya was hinting at yesterday? They do address two core challenges of self-improvement -- let's discuss!

My talk is today at 11:40am, West Meeting Room 220-222, #NeurIPS2024
language-gamification.github.io/schedule/

14.12.2024 16:30 | 👍 27    🔁 1    💬 0    📌 0

Don't get to talk enough about RL during #neurips2024? Then join us for more, tomorrow night at The Pearl!

10.12.2024 22:42 | 👍 14    🔁 0    💬 0    📌 0

Dynamic programming has a fun origin story. In 1950, Bellman wanted to coin a term that "was something not even a Congressman could object to".
See here:
pubsonline.informs.org/doi/pdf/10.1...

09.12.2024 16:50 | 👍 5    🔁 0    💬 1    📌 0

This year's (first-ever) RL conference was a breath of fresh air! And now that it's established, the next edition is likely to be even better: Consider sending your best and most original RL work there, and then join us in Edmonton next summer!

02.12.2024 19:37 | 👍 19    🔁 3    💬 0    📌 0

Ohh... good morning to you too!

Clearly this got off on the wrong foot: do you want to try again, maybe more constructively (in the spirit of bluesky not being the other place)? This is a preprint, so I'd be happy to hear your suggestions for making it less "ignorant"...

02.12.2024 10:36 | 👍 4    🔁 0    💬 1    📌 0

Either one or many players. For "improvement" to be well-defined, one agent must be special (see footnote 6), but the multi-agent setting has many benefits.

30.11.2024 16:54 | 👍 1    🔁 0    💬 1    📌 0
Open-Endedness is Essential for Artificial Superhuman Intelligence
In recent years there has been a tremendous surge in the general capabilities of AI systems, mainly fuelled by training foundation models on internet-scale data. Nevertheless, the creation of open-ended...

1: open-ended means that it will keep producing novel and learnable artifacts (see the definition here: arxiv.org/abs/2406.04268), on the timescale of interest for the observer.

2: I think as a thought experiment it is valid, as it could work in principle, but of course it hasn't been built?

29.11.2024 12:32 | 👍 1    🔁 0    💬 1    📌 0

In section 5 (second paragraph), there's about a dozen references to language games people are already using (one per paper), some with ingenious ways to provide feedback.

Also, I suspect the workshop will ultimately have the poster abstracts online with plenty of additional material!

28.11.2024 19:08 | 👍 1    🔁 0    💬 1    📌 0
Language Gamification Workshop 2024
Join us for the Language Gamification Workshop 2024, featuring keynote speeches, panel discussions, and poster sessions.

I'll also be giving a talk about this at the @neuripsconf.bsky.social workshop on "Language Gamification" in two weeks. Pop by if you're around!

language-gamification.github.io

28.11.2024 16:03 | 👍 17    🔁 1    💬 2    📌 0
Boundless Socratic Learning with Language Games
An agent trained within a closed system can master any desired capability, as long as the following three conditions hold: (a) it receives sufficiently informative and aligned feedback, (b) its covera...

Are there limits to what you can learn in a closed system? Do we need human feedback in training? Is scale all we need? Should we play language games? What even is "recursive self-improvement"?

Thoughts about this and more here:
arxiv.org/abs/2411.16905

28.11.2024 16:01 | 👍 111    🔁 18    💬 7    📌 6

@colah.bsky.social: with a few years' hindsight, how do you see the Distill space now? Is there a chance for a reboot or a rebirth in another form?

28.11.2024 11:23 | 👍 2    🔁 0    💬 0    📌 0
Distill - Latest articles about machine learning
Articles about Machine Learning

I think the Distill journal was really valuable in this space, but unfortunately is no longer around to help...

distill.pub

27.11.2024 13:42 | 👍 12    🔁 0    💬 2    📌 0
POWER and PROGRESS - Massachusetts Institute of Technology
Discover a bold reinterpretation of economics and history in Power and Progress, the latest book by bestselling authors Daron Acemoglu and Simon Johnson. Explore their 1000-year struggle over technolo...

If you're happy with a book-length answer (to the broader question on which technologies empower whom, why, and when), Acemoglu and Johnson have some excellent analysis:
shapingwork.mit.edu/power-and-pr...

25.11.2024 23:27 | 👍 1    🔁 0    💬 0    📌 0

Oh, this is my tribe!

Some other people here that I appreciate for their infectious positivity:
@akoopa.bsky.social
@jhamrick.bsky.social
@rockt.ai
@pcastr.bsky.social
@luisazintgraf.bsky.social
@dabelcs.bsky.social
@aditimavalankar.bsky.social

23.11.2024 10:35 | 👍 12    🔁 0    💬 4    📌 0

RLC will be held at the Univ. of Alberta, Edmonton, in 2025. I'm happy to say that we now have the conference's website out: rl-conference.cc/index.html

Looking forward to seeing you all there!

@rl-conference.bsky.social
#reinforcementlearning

22.11.2024 22:46 | 👍 60    🔁 19    💬 2    📌 2

Ok, we'll have to make sure a restricted closed system generates an open-ended set of ideas then! 😉

20.11.2024 10:10 | 👍 2    🔁 0    💬 0    📌 0

Now if only that pack could keep growing in, say, an open-ended way...

20.11.2024 09:45 | 👍 8    🔁 0    💬 1    📌 0

@togelius.bsky.social often has out-of-distribution takes -- but be warned, some of his thoughts come in book-length: mitpress.mit.edu/978026254934...

17.11.2024 19:37 | 👍 3    🔁 0    💬 1    📌 0
