Tom Silver @tomssilver - Bluesky Profile

This week's #PaperILike is "The Utility of Temporal Abstraction in Reinforcement Learning" (Jong et al., AAMAS 2008).

My favorite underrated paper in hierarchical RL. Unpacks how options can help *or hurt* learning performance. Fun writing.

PDF: www.ifaamas.org/Proceedings/...

03.08.2025 12:30 — 👍 2 🔁 0 💬 0 📌 0

This week's #PaperILike is "Width and Serialization of Classical Planning Problems" (Lipovetzky & Geffner, ECAI 2012).

If you only read a few classical planning papers, this should be one! Illuminating and practically useful.

PDF: www-i6.informatik.rwth-aachen.de/~hector.geff...

27.07.2025 13:02 — 👍 1 🔁 1 💬 0 📌 0

Stop! Planner Time: Metareasoning for Probabilistic Planning Using Learned Performance Profiles | Proceedings of the AAAI Conference on Artificial Intelligence

This week's #PaperILike is "Stop! Planner Time: Metareasoning for Probabilistic Planning Using Learned Performance Profiles" (Budd et al., AAAI 2024).

Metareasoning is increasingly important as we continue to make progress on "reasoning."

PDF: ojs.aaai.org/index.php/AA...

20.07.2025 13:44 — 👍 2 🔁 0 💬 0 📌 0

PushWorld: A benchmark for manipulation planning with tools and movable obstacles
While recent advances in artificial intelligence have achieved human-level performance in environments like Starcraft and Go, many physical reasoning tasks remain challenging for modern algorithms. To...

This week's #PaperILike is "PushWorld: A benchmark for manipulation planning with tools and movable obstacles" (Kansky et al., 2023).

Fans of benchmarks like ARC will enjoy the simple mechanics and the difficult reasoning required.

PDF: arxiv.org/abs/2301.10289

13.07.2025 12:10 — 👍 6 🔁 0 💬 0 📌 0

This week's #PaperILike is "Effort Level Search in Infinite Completion Trees with Application to Task-and-Motion Planning" (Toussaint et al., ICRA 2024).

Addresses the meta-reasoning challenge that is core to TAMP. Toussaint is always worth a read.

PDF: www.user.tu-berlin.de/mtoussai/24-...

06.07.2025 14:19 — 👍 3 🔁 1 💬 0 📌 0

The Power of Resets in Online Reinforcement Learning
Simulators are a pervasive tool in reinforcement learning, but most existing algorithms cannot efficiently exploit simulator access -- particularly in high-dimensional domains that require general fun...

This week's #PaperILike is "The Power of Resets in Online Reinforcement Learning" (Mhammedi et al., 2024).

If you're doing RL in sim, why not use the sim to its full potential? Reset to any state! (gym.Env.reset() is not all we need.)

PDF: arxiv.org/abs/2404.15417

29.06.2025 13:08 — 👍 5 🔁 2 💬 0 📌 0
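
To make the contrast in the post above concrete, here is a minimal sketch of the difference between a standard episodic reset and a simulator that can be reset to any state. Everything in it is an assumption made for illustration: `ChainSim`, `reset_to`, and `random_rollout` are hypothetical names, the environment is a toy 1-D chain, and none of it is the algorithm from Mhammedi et al. or part of the gym API.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ChainSim:
    """Hypothetical toy simulator: a 1-D chain with states 0..length-1."""

    length: int = 10
    state: int = 0

    def reset(self) -> int:
        # Standard episodic reset: always back to the fixed start state 0.
        self.state = 0
        return self.state

    def reset_to(self, state: int) -> int:
        # Simulator-only capability (assumed here): jump to any valid state.
        if not 0 <= state < self.length:
            raise ValueError(f"state {state} is outside 0..{self.length - 1}")
        self.state = state
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # action is -1 (left) or +1 (right); reward 1.0 only at the last state.
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        return self.state, (1.0 if done else 0.0), done


def random_rollout(
    sim: ChainSim, start: Optional[int], horizon: int, rng: random.Random
) -> float:
    """Run a uniformly random policy; start=None uses the standard reset."""
    if start is None:
        sim.reset()
    else:
        sim.reset_to(start)
    total = 0.0
    for _ in range(horizon):
        _, reward, done = sim.step(rng.choice([-1, 1]))
        total += reward
        if done:
            break
    return total
```

With the default `length=10`, a 3-step rollout from the standard reset can never reach the rewarding state, whereas `random_rollout(sim, start=8, horizon=3, rng=rng)` can, which is the kind of advantage arbitrary resets offer in this toy setting.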

This week's #PaperILike is "Learning over Subgoals for Efficient Navigation of Structured, Unknown Environments" (Stein et al., CoRL 2018).

A highly original combination of learning + planning that is still underrated (despite winning a CoRL award!)

PDF: proceedings.mlr.press/v87/stein18a...

22.06.2025 14:30 — 👍 3 🔁 1 💬 0 📌 0

YouTube video by Valentin Hartmann: Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly

This week's #PaperILike is "Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly" (Hartmann et al., TRO 2022).

Take two minutes to watch this video: www.youtube.com/watch?v=Gqho...

I don't use a lot of emojis, but 🤯

PDF: arxiv.org/abs/2106.02489

15.06.2025 12:50 — 👍 12 🔁 2 💬 0 📌 0

From Real World to Logic and Back: Learning Generalizable Relational Concepts For Long Horizon Robot Planning
Humans efficiently generalize from limited demonstrations, but robots still struggle to transfer learned knowledge to complex, unseen tasks with longer horizons and increased complexity. We propose th...

This week's #PaperILike is "From Real World to Logic and Back: Learning Generalizable Relational Concepts For Long Horizon Robot Planning" (Shah et al., 2024).

A fresh & clever approach with very impressive few-shot generalization results.

PDF: arxiv.org/abs/2402.11871

08.06.2025 12:22 — 👍 5 🔁 0 💬 0 📌 0

This week's #PaperILike is "Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models" (Lamb et al., 2022).

Part of an exciting line of work: sites.google.com/view/agent-i...

This one has an awesome related work section.

PDF: arxiv.org/abs/2207.08229

01.06.2025 11:51 — 👍 3 🔁 0 💬 0 📌 0

Grounding Language Plans in Demonstrations Through Counterfactual Perturbations
Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI. Whereas prior works have focused on leveraging LLMs dir...

This week's #PaperILike is "Grounding Language Plans in Demonstrations Through Counterfactual Perturbations" (Wang et al., ICLR 2024).

A very ideas-rich paper that combines mode-based planning, LLMs, abstraction, few-shot learning, and real robots!

PDF: arxiv.org/abs/2403.17124

25.05.2025 14:58 — 👍 0 🔁 0 💬 0 📌 0

🧵1/ New paper! 📄 InnateCoder: Learning Programmatic Options with Foundation Models

This is the final chapter of Rubens Moraes' PhD thesis at Universidade Federal de Viçosa, Brazil, written in collaboration with Quazi Sadmine and Hendrik Baier.

arXiv: arxiv.org/abs/2505.12508

23.05.2025 20:31 — 👍 5 🔁 3 💬 1 📌 0

Did something today that I never expected to do: made a donation to Harvard!

23.05.2025 18:41 — 👍 1 🔁 0 💬 0 📌 0

Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings
This paper introduces a new approach for continual planning and model learning in relational, non-stationary stochastic environments. Such capabilities are essential for the deployment of sequential d...

This week's #PaperILike is "Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings" (Karia et al., ICAPS 2024).

A sophisticated approach to a hard & realistic problem. See also their other nice works on RMDPs.

PDF: arxiv.org/abs/2402.08145

18.05.2025 13:37 — 👍 2 🔁 0 💬 0 📌 0

Meta-Optimization and Program Search using Language Models for Task and Motion Planning
Intelligent interaction with the real world requires robotic agents to jointly reason over high-level plans and low-level controls. Task and motion planning (TAMP) addresses this by combining symbolic...

This week's #PaperILike is "Meta-Optimization and Program Search using Language Models for Task and Motion Planning" (Shcherba et al., 2025).

I don't often post such new papers, but I'm very excited to see more TAMP + LLM-based program synthesis.

PDF: arxiv.org/abs/2505.03725

11.05.2025 16:35 — 👍 4 🔁 0 💬 0 📌 0

Creative Robot Tool Use with Large Language Models
Tool use is a hallmark of advanced intelligence, exemplified in both animal behavior and robotic capabilities. This paper investigates the feasibility of imbuing robots with the ability to creatively ...

This week's #PaperILike is "Creative Robot Tool Use with Large Language Models" (Xu et al., 2023).

Very fun to see real robots solving physics puzzles (with tool use!)

PDF: arxiv.org/abs/2310.13065

03.05.2025 21:29 — 👍 1 🔁 0 💬 0 📌 0

This week's #PaperILike is "Deep Reinforcement Learning that Matters" (Henderson et al., AAAI 2018).

A great primer on how to do deep RL rigorously. Among the papers that I share the most, especially in reviews!

PDF: arxiv.org/pdf/1709.06560

27.04.2025 14:19 — 👍 5 🔁 0 💬 0 📌 0

This week's #PaperILike is "Navigation Among Movable Obstacles" (Stilman & Kuffner, 2005).

An important early paper that informed a lot of subsequent work in task and motion planning (TAMP). Also, pretty impressive figures for 2005!

PDF: www.golems.org/papers/Stilm...

20.04.2025 15:16 — 👍 2 🔁 0 💬 0 📌 0

FINALLY.

Some gumption from the President of Harvard today:

14.04.2025 17:47 — 👍 107 🔁 15 💬 2 📌 0

Online algorithms for POMDPs with continuous state, action, and observation spaces
Online solvers for partially observable Markov decision processes have been applied to problems with large discrete state spaces, but continuous state, action, and observation spaces remain a challeng...

This week's #PaperILike is "Online Algorithms for POMDPs with Continuous State, Action, and Observation Spaces" (Sunberg & Kochenderfer, ICAPS 2018).

POMDPs are hard to understand and hard to solve. This paper helps with both!

PDF: arxiv.org/abs/1709.06196

13.04.2025 13:59 — 👍 19 🔁 3 💬 0 📌 0

Opinion | A Playbook for Law Firms and Colleges to Stand Up to President Trump
Law firms and universities do not need to capitulate. Here’s how they can fight back.

Thread: A surprisingly strong NYT editorial:
www.nytimes.com/2025/04/06/o...
"the most likely path to American autocracy depends on not only a power-hungry president but also the voluntary capitulation of a cowed civil society. It depends on the mistaken belief that a president is invincible."...

06.04.2025 13:18 — 👍 47 🔁 24 💬 1 📌 0

This week's #PaperILike is "Control-Limited Differential Dynamic Programming" (Tassa et al., ICRA 2014).

This paper has an extremely clear and concise overview of trajectory optimization for robotics, especially DDP.

PDF: www.roboti.us/lab/papers/T...

06.04.2025 12:49 — 👍 0 🔁 0 💬 0 📌 0

This week's #PaperILike is "Impossibly Good Experts and How to Follow Them" (Walsman et al., ICLR 2023).

A very well written paper that will be especially interesting if you're using teacher-student training in your work.

PDF: openreview.net/pdf?id=sciA_...

30.03.2025 15:49 — 👍 0 🔁 0 💬 0 📌 0

Synthesizing world models for bilevel planning
Modern reinforcement learning (RL) systems have demonstrated remarkable capabilities in complex environments, such as video games. However, they still fall short of achieving human-like sample efficie...

arxiv.org/abs/2503.20124

27.03.2025 00:47 — 👍 21 🔁 5 💬 0 📌 1

This week's #PaperILike is "Gaussian Process Implicit Surfaces for Shape Estimation and Grasping" (Dragiev et al., ICRA 2011).

Useful if you're thinking about implicit representations for manipulation. Also +1 for uncertainty quantification.

PDF: argmin.lis.tu-berlin.de/papers/11-dr...

23.03.2025 16:32 — 👍 5 🔁 0 💬 1 📌 0

Simple random search of static linear policies is competitive for reinforcement learning

This week's #PaperILike is "Simple random search of static linear policies is competitive for RL" (Mania et al., NeurIPS 2018).

"Simple baselines should be established before moving forward to more complex [ones]." Generally agree!

PDF: proceedings.neurips.cc/paper/2018/h...

16.03.2025 16:34 — 👍 7 🔁 2 💬 1 📌 0

A Tour of Reinforcement Learning: The View from Continuous Control
This manuscript surveys reinforcement learning from the perspective of optimization and control with a focus on continuous control applications. It surveys the general formulation, terminology, and ty...

This week's #PaperILike is "A Tour of Reinforcement Learning: The View from Continuous Control" (Recht 2018).

Pairs well with last week's #PaperILike -- another good bridge between RL and control theory.

PDF: arxiv.org/abs/1806.09460

09.03.2025 15:32 — 👍 7 🔁 1 💬 0 📌 0

Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming
In this paper we describe a new conceptual framework that connects approximate Dynamic Programming (DP), Model Predictive Control (MPC), and Reinforcement Learning (RL). This framework centers around ...

This week's #PaperILike is "Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming" (Bertsekas 2024).

If you know 1 of {RL, controls} and want to understand the other, this is a good starting point.

PDF: arxiv.org/abs/2406.00592

02.03.2025 16:19 — 👍 43 🔁 8 💬 0 📌 0

This week's #PaperILike (well, book) is "The Structure of Scientific Revolutions" (Kuhn, 1962).

Zooming way, way out and thinking about what we're all doing here. (I'll get back to robots & AI next week.)

PDF (but also buy it): www.lri.fr/~mbl/Stanfor...

23.02.2025 14:40 — 👍 3 🔁 0 💬 0 📌 0

This week's #PaperILike is "Learning Reusable Manipulation Strategies" (Mao et al., CoRL 2023).

This paper is my favorite recent account of how robots can learn & dramatically generalize "tricks" or "mechanisms" from very little data.

PDF: arxiv.org/pdf/2311.03293

16.02.2025 17:52 — 👍 1 🔁 0 💬 0 📌 0
