Florent Delgrange (@florentdelgrange.bsky.social)

If you want to know more about this vision, check out the paper or come talk to me at AAMAS in May!

03.03.2026 15:03 — 👍 0 🔁 0 💬 0 📌 0

The framework unifies reinforcement learning, formal verification, and reactive synthesis to deliver world models that can be checked and queried, enabling agents to synthesize verifiable programs, learn new policies quickly, and maintain correctness while adapting to novelty.

03.03.2026 15:03 — 👍 0 🔁 0 💬 1 📌 0

This 'Blue Sky Idea' paper lays out a research agenda for foundation world models: persistent and compositional world models designed to support verification and adaptation of learning agents in a single, principled loop.

03.03.2026 15:03 — 👍 0 🔁 0 💬 1 📌 0

Glad to share that my paper,
“Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments”
has been accepted at AAMAS 2026!

📄 Paper: arxiv.org/abs/2602.23997

03.03.2026 15:03 — 👍 0 🔁 0 💬 1 📌 0

ALA 2026

📢 Deadline Extended!
The submission deadline for the Adaptive and Learning Agents (ALA) Workshop at #AAMAS2026 (Paphos, Cyprus 🇨🇾) has been extended!
🗓️ Feb 26, 2026
alaworkshop2026.github.io

29.01.2026 01:59 — 👍 2 🔁 2 💬 0 📌 0

Looking forward to discussing this at ICLR 🇧🇷!

29.01.2026 13:16 — 👍 0 🔁 0 💬 0 📌 0

We study safe policy improvement in general state spaces by combining world models, representation learning, and careful policy updates. We link representation quality and model prediction loss to safe updates, and introduce DeepSPI, an on-policy algorithm with strong empirical performance.

29.01.2026 13:16 — 👍 0 🔁 0 💬 1 📌 0

RL relies on representation learning to handle complex observations. Alongside policies & value functions, agents may learn world models to support planning or improve sample efficiency. These components are learned jointly and keep evolving during training, so it is crucial to ensure that each policy update actually improves the agent.

29.01.2026 13:16 — 👍 0 🔁 0 💬 1 📌 0

🎉 Excited to share that our paper, “Deep SPI: Safe Policy Improvement via World Models,” has been accepted to ICLR 2026 🇧🇷!

📄 full paper: arxiv.org/abs/2510.12312
🤝 with @raphael.avalos.fr and @willemropke.bsky.social

🧵

29.01.2026 13:16 — 👍 0 🔁 0 💬 1 📌 0

Really looking forward to ALA @ AAMAS 2026! Glad to be co-organizing this edition. If you’re working on adaptive & learning agents, I hope to see you in Paphos!

20.01.2026 15:58 — 👍 1 🔁 0 💬 0 📌 0

Mind the GAP!

we've had a few works proposing techniques for enabling scaling in deep rl, such as MoEs, tokenization, & sparse training.
ghada sokar and i looked further & found a bit more clarity into *what* enables scaling, leading us to simpler solutions (see GAP in figure)!
1/

26.05.2025 16:31 — 👍 34 🔁 7 💬 1 📌 0

PhD Position in Causal Agent-based Modelling of Complex Social Systems
Join this exciting interdisciplinary research project at the Centre for Complex Systems Studies and study causal agent-based modelling!

🎓 PhD position available!

Join our interdisciplinary research project on causal agent-based modelling!

🔍 Looking for curious minds with an MSc degree (or close to completing one) in CS/AI/related fields.
📍 Location: Utrecht University, NL
🗓️ Deadline: 16 June 2025
📩 Info: www.uu.nl/en/organisat...

13.05.2025 19:31 — 👍 8 🔁 5 💬 0 📌 0

5/ This work is a first step towards improving the reliability of learning agents by unifying RL and reactive synthesis.

I'm very grateful to my co-authors for this great collaboration!
Check out my blog post for more insights!

I'll present the paper in a few weeks at @aamasconf.bsky.social

05.05.2025 16:20 — 👍 0 🔁 0 💬 0 📌 0

4/ This approach allows for
- a separation of concerns
- formal guarantees through (PAC) bounds on both the world model quality and policy performance
- reusability and scaling to domains where synthesis was not applicable

05.05.2025 16:20 — 👍 0 🔁 0 💬 1 📌 0

3/ Given the map, the learned low-level models/policies, and a formal specification describing what the agent should and should not do, we apply reactive synthesis to obtain a high-level planner.

05.05.2025 16:20 — 👍 0 🔁 0 💬 1 📌 0

2/ We consider scenarios where a "map" describing the environment's high-level structure can be provided as a graph. Each vertex is a "room," where we apply RL to get low-level policies. In addition, we learn a world model of each room that can be formally verified.

05.05.2025 16:20 — 👍 0 🔁 0 💬 1 📌 0
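To make the map-plus-planner idea in 2/ and 3/ a bit more concrete, here is a minimal Python sketch. It is not the paper's method: plain value iteration on a reachability objective stands in for reactive synthesis, and the room names, the success probabilities, and the `plan` helper are all hypothetical. It only illustrates how a graph of rooms, each with a learned low-level policy of estimated quality, can be turned into a high-level choice of which room to enter next.

```python
from typing import Dict, Tuple

# Hypothetical map: room -> {neighbouring room: estimated probability that the
# low-level policy trained in this room successfully crosses into the neighbour}.
# Room names and numbers are made up for illustration.
ROOM_MAP: Dict[str, Dict[str, float]] = {
    "A": {"B": 0.9, "C": 0.5},
    "B": {"D": 0.8},
    "C": {"D": 0.99},
    "D": {},
}


def plan(
    room_map: Dict[str, Dict[str, float]], target: str, sweeps: int = 100
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Return (value, choice) for reaching `target`.

    value[room]  = best estimated probability of reaching the target from room
    choice[room] = neighbouring room the high-level planner heads for next

    Simplifying assumption: a failed low-level policy ends the run (no retry).
    """
    value = {room: (1.0 if room == target else 0.0) for room in room_map}
    choice: Dict[str, str] = {}
    for _ in range(sweeps):
        for room, exits in room_map.items():
            if room == target or not exits:
                continue
            # Pick the exit maximising (crossing success) x (value of next room).
            best_next, best_val = max(
                ((nxt, p * value[nxt]) for nxt, p in exits.items()),
                key=lambda item: item[1],
            )
            value[room] = best_val
            choice[room] = best_next
    return value, choice


value, choice = plan(ROOM_MAP, target="D")
```

In this toy map the planner routes A through B rather than C: the crossing into C is too unreliable, even though C's own policy is nearly perfect. A real specification would typically also include safety constraints, which this sketch ignores.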

1/ RL enables agents to learn efficient policies in complex domains, but lacks formal guarantees — a challenge in high-stakes scenarios. In contrast, when the environment model is accessible, reactive synthesis offers formal guarantees, but struggles to scale.

05.05.2025 16:20 — 👍 0 🔁 0 💬 1 📌 0

Composing Reinforcement Learning Policies, with Formal Guarantees | Florent Delgrange
Synthesizing controllers in large domains from verified world models and reinforcement learning policy composition.

Happy to share our new paper (AAMAS 2025)!
We combine reinforcement learning 🤖🧠 & reactive synthesis ⚙️ for learning scalable safe policies in complex tasks with formal guarantees.

📑paper: arxiv.org/abs/2402.13785
✍️blogpost: delgrange.me/post/composi...

A thread🧵⤵️

05.05.2025 16:20 — 👍 3 🔁 1 💬 1 📌 0

ALA 2025

Still 7 days to submit your work to the ALA workshop at AAMAS! We welcome full papers, work in progress, and 2-page abstracts of recently published journal papers. All the info is available at ala-workshop.github.io.

28.01.2025 18:06 — 👍 2 🔁 1 💬 0 📌 1

📌

25.11.2024 13:04 — 👍 0 🔁 0 💬 0 📌 0

Thanks!

25.11.2024 12:16 — 👍 1 🔁 0 💬 0 📌 0

Hi, I'd be pleased if you could add me too if there's still room :-)

25.11.2024 11:03 — 👍 1 🔁 0 💬 1 📌 0

Thanks!

24.11.2024 14:25 — 👍 0 🔁 0 💬 0 📌 0

Hey, I’d love to be added!

24.11.2024 14:15 — 👍 0 🔁 0 💬 1 📌 0

Empirical Design in Reinforcement Learning
Empirical design in reinforcement learning is no small task. Running good experiments requires attention to detail and at times significant computational resources. While compute resources available p...

Another must-read for reinforcement learning. It answers many key questions for researchers:
- Do I need multiple training runs?
- How do I report model confidence?
- And a great section on common mistakes to fend off reviewer 2
🧪
#DRL
#reinforcementlearning
#AI
arxiv.org/abs/2304.01315

22.11.2024 07:25 — 👍 38 🔁 5 💬 1 📌 0

In addition to the Deep Learning Theory starter pack, I've also put together a starter pack for Reinforcement Learning Theory. Let me know if you'd like to be included or suggest someone to add to the list!

go.bsky.app/LWyGAAu

22.11.2024 21:56 — 👍 29 🔁 10 💬 11 📌 1

hey! working on RL and formal verification, do you mind adding me? :-)

22.11.2024 10:55 — 👍 0 🔁 0 💬 0 📌 0

If you're an RL researcher or RL adjacent, pipe up to make sure I've added you here!
go.bsky.app/3WPHcHg

09.11.2024 16:42 — 👍 70 🔁 26 💬 52 📌 0

Posts by Florent Delgrange (@florentdelgrange.bsky.social)