Check out our new work on autonomous driving in new cities with map data + MARL!
21.02.2026 14:06 · @bidiptas13.bsky.social
Evolve at the hyperscale!
Work co-led with Mattie Fellows and Juan Agustin Duque.
Made possible by #Isambard and AIRR
🌐 Website: eshyperscale.github.io
📄 Paper: alphaxiv.org/abs/2511.16652
💻 Code: github.com/ESHyperscale...
🥚 NanoEgg: github.com/ESHyperscale... (train in int 🔢)
Scaling LLM Reasoning with EGGROLL 🥚🧠📈
Using 🥚 to finetune RWKV-7 language models outperforms GRPO on Countdown and GSM8K ✅
🥚 significantly outperformed GRPO on the Countdown task, achieving 35% validation accuracy versus GRPO's 23% ✅
EGGROLL 🥚 for RL 🎮🤖
🥚 is competitive with, and in many cases better than, OpenES, even before considering the vast speed-up!
🥚 matched OpenES on 7/16 environments and outperformed it on another 7/16
🥚's low-rank approach does not compromise ES performance
🥚 EGGROLLing in the Deep with a 💯x Speedup ⚡
🥚's speed nearly reaches the throughput of pure batch inference, leaving OpenES far behind
🥚 reaches 91% of pure-batch-inference speed, vs. only 0.41% for OpenES
The EGGROLL Recipe
🧠🛠️ We replace full-rank perturbations with low-rank ones. Each aggregated update is still high rank, maintaining expressivity while training much faster
🥚 EGGROLL's update converges to the full-rank update at a fast 1/rank rate, and the method is effective even with rank 1
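To make the recipe concrete, here is a minimal NumPy sketch of an antithetic ES step with low-rank perturbations. It follows the idea described above, but every name and hyperparameter here is illustrative, not the paper's implementation: each population member perturbs the weight matrix with a thin outer product A·Bᵀ, and fitness-weighted perturbations are averaged into an update that is high rank even when each perturbation has rank 1.

```python
import numpy as np

def low_rank_es_step(W, fitness_fn, pop=64, rank=1, sigma=0.1, lr=0.05, rng=None):
    """One antithetic ES step with low-rank perturbations (illustrative sketch).

    Each pair perturbs W by +/- sigma * A @ B.T, with thin random factors
    A (m x r) and B (n x r). Averaging fitness-weighted perturbations over
    the population yields a high-rank update even when rank == 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    m, n = W.shape
    grad = np.zeros_like(W)
    for _ in range(pop // 2):
        A = rng.standard_normal((m, rank))
        B = rng.standard_normal((n, rank))
        E = (A @ B.T) / np.sqrt(rank)          # low-rank perturbation
        f_diff = fitness_fn(W + sigma * E) - fitness_fn(W - sigma * E)
        grad += (f_diff / (2.0 * sigma)) * E   # antithetic fitness weighting
    return W + lr * grad / (pop // 2)          # ascend the estimated gradient
```

With rank 1, each perturbation needs only O(m + n) random numbers instead of O(mn), which is where the memory and throughput savings at huge population sizes come from.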
We use EGGROLL 🥚 to train RNN language models from scratch using only integer datatypes (and no activation functions!), scaling population size from 64 to 262,144
That's 2 orders of magnitude larger than prior ES work ✅
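To illustrate why evolution makes integer-only training possible at all, here is a toy (1+λ)-style hill-climbing sketch. This is not the paper's recipe, just the underlying principle: candidates stay pure int8, selection needs only fitness comparisons, and no gradients or floating-point numbers ever appear.

```python
import numpy as np

def int_es_step(w, fitness_fn, pop=32, rng=None):
    """One integer-only (1+lambda) evolution step (toy illustration).

    All candidate weights stay int8; the only operations are integer
    additions and fitness comparisons, so no backprop and no floats.
    """
    rng = np.random.default_rng() if rng is None else rng
    best_w, best_f = w, fitness_fn(w)
    for _ in range(pop):
        noise = rng.integers(-1, 2, size=w.shape, dtype=np.int8)   # {-1, 0, 1}
        cand = (w.astype(np.int16) + noise).clip(-128, 127).astype(np.int8)
        f = fitness_fn(cand)
        if f > best_f:                    # keep the best candidate so far
            best_w, best_f = cand, f
    return best_w
```

Because the loop only ever compares fitness values, the datatype of the weights is irrelevant to the optimizer, which is what lets ES-style methods train with int8 parameters end to end.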
Introducing 🥚 EGGROLL 🥚 (Evolution Guided General Optimization via Low-rank Learning)! 🚀 Scaling backprop-free Evolution Strategies (ES) to billion-parameter models at large population sizes
⚡ 100x Training Throughput
🎯 Fast Convergence
🔢 Pure Int8 Pretraining of RNN LLMs