Join our MAGELLAN talk on July 2!
We'll explore how LLM agents can monitor their own learning progress and choose what to learn next, like curious humans 🤔
1h presentation + 1h Q&A on autotelic agents & more!
July 2, 4:30 PM CEST
forms.gle/1PC2fxJx1PZYfqFr7
25.06.2025 15:14
🚨New preprint🚨
When testing LLMs with questions, how can we know they did not already see the answers during training? In this new paper, we propose a simple, fast, out-of-the-box method to spot contamination on short texts, with @stepalminteri.bsky.social and Pierre-Yves Oudeyer!
15.11.2024 13:47
🧭MAGELLAN builds on many works that use LP to drive automatic curriculum learning, e.g. by @rockt.ai @egrefen.bsky.social @jeffclune.com @tomssilver.bsky.social @tambetm.bsky.social @jrsrichmond.bsky.social @ryanpsullivan.bsky.social
24.03.2025 15:09
Thanks to @lorisgaven.bsky.social and @clementromac.bsky.social for the fun time doing research on this topic, and huge thanks also to @ccolas.bsky.social, Sylvain Lamprier, Olivier Sigaud, and @pyoudeyer.bsky.social for their supervision!!
24.03.2025 15:09
Q4: Adaptation to Evolving Goal Spaces
We replaced the entire goal space with unseen goals from the same categories. 🧭MAGELLAN generalized its LP estimates and kept performance on par with baselines that rely on human expertise! ✨
24.03.2025 15:09
Q3: Generalization
By the end of training, 🧭MAGELLAN has structured the goal embedding space, consistently predicting success probability for unseen goals, a key step toward scalable open-ended learning!
24.03.2025 15:09
Evolution of the observed competence (SR) when evaluating policies on 64 training goals per category every 5000 episodes. We report the average SR over evaluated goals along with standard deviation (8 seeds). Icons indicate the average time step at which a method mastered a goal (i.e. SR $> 90\%$). We add stars to MAGELLAN, denoting significantly earlier mastery of a category compared to the method with the star's color (p-value $<8\times10^{-4}$). The dotted line (EK-Online-ALP) indicates that the method relies on expert knowledge.
Q2: Curriculum Learning
🧭MAGELLAN autonomously discovers goal families across tens of thousands of goals, performing on par with baselines augmented with expert knowledge, but without requiring predefined goal clusters!
24.03.2025 15:09
🎯 Q1: Competence Estimation
🧭MAGELLAN matches expert baselines in estimating competence over tens of thousands of goals, but with minimal cost & error! Unlike other methods, it efficiently tracks competence transfer across large goal spaces.
24.03.2025 15:09
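As a rough illustration of what "estimating competence" means here (this is a hypothetical sketch, not the paper's actual architecture): a small logistic head over fixed goal embeddings can be trained online on success/failure outcomes to predict a success probability per goal. All names below are illustrative assumptions.

```python
import math

def predict_success(embedding, weights, bias):
    """Estimated success probability for a goal embedding (logistic head)."""
    z = sum(w * x for w, x in zip(weights, embedding)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def sgd_step(embedding, outcome, weights, bias, lr=0.1):
    """One SGD step on the log-loss for a single (goal, success) outcome."""
    p = predict_success(embedding, weights, bias)
    err = p - outcome  # gradient of the log-loss w.r.t. the logit
    new_w = [w - lr * err * x for w, x in zip(weights, embedding)]
    new_b = bias - lr * err
    return new_w, new_b
```

Because the head reads semantic embeddings rather than a per-goal table, goals that are phrased similarly share statistics, which is what allows estimates to transfer to goals never practiced.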
We studied 4 scientific questions:
Q1: How does 🧭MAGELLAN's competence estimation compare to classical approaches?
Q2: Can it be used to build an efficient curriculum?
Q3: Can it generalize to unseen goals?
Q4: Can it adapt to an evolving goal space?
Let's dive in!
24.03.2025 15:09
By capturing semantic relationships between goals, 🧭MAGELLAN enables efficient LP estimation & adaptive goal prioritization, all without relying on expert-defined groupings! 🔥 #CurriculumLearning
24.03.2025 15:09
Our LLM agent uses 🧭MAGELLAN to estimate past & current competence, computing absolute learning progress (ALP) for each goal. The agent then selects goals that maximize ALP, learning efficiently via online RL. #ReinforcementLearning #LLM
24.03.2025 15:09
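The ALP-driven selection loop can be sketched as follows — a minimal illustration under stated assumptions, not the paper's implementation. Here `current` and `past` stand in for the agent's current and earlier per-goal competence estimates, and goals are sampled proportionally to ALP with a little uniform exploration:

```python
import random

def alp(current_competence, past_competence):
    """Absolute learning progress: |current - past| competence estimate."""
    return abs(current_competence - past_competence)

def sample_goal(goals, current, past, eps=0.1):
    """Pick a goal with probability proportional to its ALP.

    With probability eps, sample uniformly instead, so goals with zero
    measured progress still get occasional practice.
    """
    if random.random() < eps:
        return random.choice(goals)
    scores = [alp(current[g], past[g]) for g in goals]
    total = sum(scores)
    if total == 0:
        return random.choice(goals)
    r = random.uniform(0, total)
    acc = 0.0
    for g, s in zip(goals, scores):
        acc += s
        if r < acc:
            return g
    return goals[-1]
```

Using the absolute difference means the sampler also revisits goals whose competence is dropping (forgetting), not only goals being newly mastered.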
Humans thrive in open-ended exploration, but AI struggles with infinite goal spaces. Learning progress (LP) helps, but scaling it is tough! 🧭MAGELLAN tackles this by efficiently generalising LP to goals not yet practiced, allowing the agent to navigate large and complex domains 🚢
24.03.2025 15:09
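For context on why scaling LP is tough: the classical approach tracks a sliding window of outcomes for every single goal and compares older vs. newer success rates — feasible for a handful of goals, hopeless for an unbounded goal space, since an estimate only exists for goals already attempted many times. A hypothetical sketch of that baseline:

```python
from collections import deque

class WindowLP:
    """Classical per-goal LP: recent vs. older success rate in a window.

    Keeps the last 2*n outcomes per goal; LP is the absolute difference
    between the success rates of the newer and older halves.
    """

    def __init__(self, n=10):
        self.n = n
        self.history = {}

    def update(self, goal, success):
        buf = self.history.setdefault(goal, deque(maxlen=2 * self.n))
        buf.append(1.0 if success else 0.0)

    def lp(self, goal):
        buf = list(self.history.get(goal, []))
        if len(buf) < 2:
            return 0.0  # never-practiced goals have no LP estimate
        mid = len(buf) // 2
        older, newer = buf[:mid], buf[mid:]
        return abs(sum(newer) / len(newer) - sum(older) / len(older))
```

The `return 0.0` branch is exactly the scaling problem: with tens of thousands of goals, almost every goal sits in that branch forever.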
Introducing 🧭MAGELLAN, our new metacognitive framework for LLM agents! It lets an agent predict its own learning progress (LP) in vast natural language goal spaces, enabling efficient exploration of complex domains. ✨ Learn more: arxiv.org/abs/2502.07709 #OpenEndedLearning #LLM #RL
24.03.2025 15:09