
Gautier Hamon

@hamongautier.bsky.social

PhD student in the INRIA Flowers team. MVA master's. reytuag.github.io/gautier-hamon/

52 Followers  |  80 Following  |  11 Posts  |  Joined: 21.11.2024

Latest posts by hamongautier.bsky.social on Bluesky



Complex cell-like structures in Flow Lenia

02.04.2025 11:52 β€” πŸ‘ 8    πŸ” 2    πŸ’¬ 0    πŸ“Œ 0
MAGELLAN: Metacognitive predictions of learning progress guide... Open-ended learning agents must efficiently prioritize goals in vast possibility spaces, focusing on those that maximize learning progress (LP). When such autotelic exploration is achieved by LLM...

πŸš€ Introducing 🧭MAGELLANβ€”our new metacognitive framework for LLM agents! It predicts its own learning progress (LP) in vast natural language goal spaces, enabling efficient exploration of complex domains.🌍✨Learn more: πŸ”— arxiv.org/abs/2502.07709 #OpenEndedLearning #LLM #RL

24.03.2025 15:09 β€” πŸ‘ 9    πŸ” 3    πŸ’¬ 1    πŸ“Œ 4

We are recruiting interns for a few projects with @pyoudeyer in Bordeaux:
> studying LLM-mediated cultural evolution with @nisioti_eleni and @Jeremy__Perez

> balancing exploration and exploitation with autotelic RL with @ClementRomac

Details and links in 🧵
Please share!

27.11.2024 17:43 β€” πŸ‘ 6    πŸ” 6    πŸ’¬ 1    πŸ“Œ 0
1e9 steps on craftax with transformerXL PPO

4e9 steps on craftax with transformerXL PPO

8/ For the curious, here are the achievement success rates on Craftax across training, for 1e9 steps (left) and 4e9 steps (right).

22.11.2024 10:15 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
GitHub - Reytuag/transformerXL_PPO_JAX

7/ The JAX ecosystem in RL is currently blooming with wonderful open-source projects from others, which I linked at the bottom of the repository. github.com/Reytuag/tran...
This work was done at @FlowersINRIA.
Also feel free to reach out if you have questions or suggestions!

22.11.2024 10:15 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

6/ Potential next steps: test it on XLand-MiniGrid, an open-ended meta-RL environment: github.com/dunnolab/xla...
I'm also curious to implement Muesli (arxiv.org/abs/2104.06159) with TransformerXL, as in arxiv.org/abs/2301.07608

22.11.2024 10:15 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

5/ Here is the training curve from 1e9 steps of training, alongside the PPO and PPO-RNN scores reported in the Craftax repo.
Note that PPO-RNN was already beating other baselines that use Unsupervised Environment Design and intrinsic motivation. arxiv.org/pdf/2402.16801

22.11.2024 10:15 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
GitHub - MichaelTMatthews/Craftax: (Crafter + NetHack) in JAX. ICML 2024 Spotlight.

4/ Testing it on the challenging Craftax from github.com/MichaelTMatt... (with little hyperparameter tuning), it obtained higher returns in 1e9 steps than PPO-RNN.
Training it for longer led it to reach the 3rd floor in Craftax, making it the first method to obtain advanced achievements.

22.11.2024 10:15 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

3/ Training a 3M-parameter Transformer for 1e6 steps in MemoryChain-bsuite (from gymnax) takes ~10 s on an A100 (with 512 envs).
Training a 5M-parameter Transformer for 1e9 steps in Craftax takes ~6 h on a single A100 (with 1024 envs).
We also support multi-GPU training.
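As a back-of-the-envelope check on those throughput figures (assuming the 6 h / 1e9-step / 1024-env numbers above):

```python
total_steps = 1e9   # total environment steps in the Craftax run
hours = 6           # reported wall-clock time on one A100
envs = 1024         # parallel environments

steps_per_sec = total_steps / (hours * 3600)  # aggregate env-steps per second
steps_per_env = total_steps / envs            # sequential steps each env runs

print(f"{steps_per_sec:,.0f} env-steps/s")   # ~46,296 env-steps/s
print(f"{steps_per_env:,.0f} steps per env")  # ~976,562 steps per env
```

So the 1e9-step figure corresponds to roughly 46k aggregate environment steps per second, which is the kind of throughput that makes these runs practical on a single GPU.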

22.11.2024 10:15 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Stabilizing Transformers for Reinforcement Learning Owing to their ability to both effectively integrate information over long time horizons and scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in ...

2/ We implement TransformerXL-PPO following "Stabilizing Transformers for Reinforcement Learning": arxiv.org/abs/1910.06764
The code follows the template from PureJaxRL: github.com/luchris429/p...
⚡️Training is fast thanks to JAX
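The key trick in "Stabilizing Transformers for RL" (GTrXL) is replacing the residual connections around each sublayer with a GRU-style gate whose bias keeps the layer close to an identity map at initialization. An elementwise sketch in plain Python (the scalar weight `w` stands in for the learned matrices a real JAX implementation would use):

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_gate(x, y, w=0.1, bg=2.0):
    """GRU-type gating from GTrXL, applied elementwise.
    x: stream input (residual branch), y: sublayer output.
    bg > 0 biases the update gate z toward 0, so the layer
    starts out passing x through almost unchanged."""
    out = []
    for xi, yi in zip(x, y):
        r = sigmoid(w * yi + w * xi)               # reset gate
        z = sigmoid(w * yi + w * xi - bg)          # update gate, biased low
        h = math.tanh(w * yi + w * (r * xi))       # candidate state
        out.append((1 - z) * xi + z * h)           # mostly xi when z ~ 0
    return out

x = [1.0, -0.5, 2.0]
y = [0.3, 0.7, -1.2]
print(gru_gate(x, y))  # stays close to x because of the gate bias
```

Starting near the identity is what stabilizes early training: gradients flow through the stream unimpeded until the gates learn to let the attention sublayers contribute.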

22.11.2024 10:15 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

1/ ⚡️Looking for a fast and simple Transformer baseline for your RL environment in JAX?
Sharing my implementation of TransformerXL-PPO: github.com/Reytuag/tran...
The implementation is the first to reach the 3rd floor and obtain advanced achievements in the challenging Craftax.
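What makes TransformerXL (rather than a vanilla Transformer) suited to long RL episodes is its segment-level recurrence: hidden states from earlier segments are cached without gradients and prepended to the attention keys/values of the current segment. A toy sketch of just the cache bookkeeping, with string placeholders instead of real hidden-state arrays and no actual attention math:

```python
def update_memory(memory, hidden, mem_len):
    """Append the current segment's hidden states to the cache and keep
    only the last `mem_len` timesteps. In a real implementation the cache
    is treated as a constant (gradients stopped)."""
    combined = memory + hidden
    return combined[-mem_len:]

def attention_context(memory, hidden):
    # Queries come from the current segment only;
    # keys/values span the cache plus the current segment.
    return memory + hidden

memory = []  # starts empty
for segment in [["h0", "h1"], ["h2", "h3"], ["h4", "h5"]]:
    context = attention_context(memory, segment)
    memory = update_memory(memory, segment, mem_len=3)

print(context)  # ['h1', 'h2', 'h3', 'h4', 'h5']: last segment sees the cache too
print(memory)   # ['h3', 'h4', 'h5']: rolling window of recent states
```

The effective context thus grows linearly with depth while each forward pass only processes one short segment, which is what keeps rollouts cheap.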

22.11.2024 10:15 β€” πŸ‘ 3    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

The video encoding might not do it full justice.
Paper: direct.mit.edu/isal/proceed...

22.11.2024 10:04 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
22.11.2024 10:00 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Putting some Flow Lenia here too

22.11.2024 09:51 β€” πŸ‘ 4    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

Now that @jeffclune.bsky.social and @joelbot3000.bsky.social are here, time for an Open-Endedness starter pack.

go.bsky.app/MdVxrtD

20.11.2024 07:08 β€” πŸ‘ 105    πŸ” 32    πŸ’¬ 16    πŸ“Œ 5

🚨New preprint🚨
When testing LLMs with questions, how can we know they did not see the answer during training? In this new paper, we propose a simple, fast, out-of-the-box method to spot contamination on short texts, with @stepalminteri.bsky.social and Pierre-Yves Oudeyer!
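One common family of contamination checks, not necessarily the paper's method, flags text that the tested model likes abnormally much relative to a reference model, since memorized passages get unusually high per-token log-likelihoods. A toy sketch with hand-picked stand-in values:

```python
def contamination_score(logprobs_target, logprobs_reference):
    """Generic memorization signal: mean per-token log-likelihood gap
    between the tested model and a reference model on the same text.
    A strongly positive gap suggests the tested model saw the text."""
    n = len(logprobs_target)
    return sum(t - r for t, r in zip(logprobs_target, logprobs_reference)) / n

# Toy values: a memorized ("seen") text vs. a fresh one.
seen = contamination_score([-0.1, -0.2, -0.1], [-2.0, -1.8, -2.2])
fresh = contamination_score([-1.9, -2.1, -2.0], [-2.0, -1.8, -2.2])
print(seen > fresh)  # True: the memorized text scores much higher
```

Using a gap against a reference model rather than raw likelihood controls for text that is simply easy to predict.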

15.11.2024 13:47 β€” πŸ‘ 9    πŸ” 4    πŸ’¬ 1    πŸ“Œ 0
