The challenge of hidden gifts is more difficult with more agents but still solvable
16.10.2025 17:53 — 👍 1 🔁 0 💬 0 📌 0
@dvnxmvlhdf5.bsky.social
MSc @mila-quebec.bsky.social and @mcgill.ca in the LiNC lab. Fixating on multi-agent RL, Neuro-AI and decisions. Ēka ē-akimiht. https://danemalenfant.com/
Scaling the self-correction term is functionally necessary: the more agents contribute to a reward, the higher the gradient order needed to infer how those agents affect your Q-value estimate of your collective sub-policy. All of it is still computed in O(N) time, where N is the number of agents.
16.10.2025 17:53 — 👍 1 🔁 0 💬 1 📌 0
To follow up on the asymptotic proof that the self-correction term works with any number of agents or coalitions, here are the results for 3 agents.
Policy-gradient agents' performance suffers with more agents, but self-correction still stabilizes learning. arxiv.org/abs/2505.20579
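I haven't reproduced the paper's method here, but the flavour of an O(N) correction can be illustrated with a toy difference-rewards-style Q-update (the function name, the running-contribution bookkeeping, and all parameters are my own assumptions, not the paper's algorithm):

```python
import numpy as np

def self_corrected_update(q_own, contrib_others, action, shared_reward, lr=0.1):
    """Toy Q-update for one agent in an N-agent team sharing a single reward.

    contrib_others holds running estimates of the other N-1 agents' average
    contributions; subtracting their sum "self-corrects" the credit assigned
    to this agent's own action. The correction is one sum over teammates,
    so the per-step cost is O(N). (Illustrative difference-rewards-style
    bookkeeping, not the paper's actual self-correction term.)
    """
    correction = contrib_others.sum()        # O(N) pass over teammates
    corrected = shared_reward - correction   # credit left for the own action
    q_own[action] += lr * (corrected - q_own[action])
    return q_own
```

For intuition: with two teammates each reliably contributing +1, an agent whose own actions add 0 or 2 to the shared reward recovers Q-values near 0 and 2 rather than near 2 and 4, so its credit assignment is not inflated by teammate contributions.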
In 2018, Charles Murray challenged me to a bet: "We will understand IQ genetically—I think most of the picture will have been filled in by 2025—there will still be blanks—but we’ll know basically what’s going on." It's now 2025, and I claim a win. I write about it in The Atlantic.
13.10.2025 13:33 — 👍 316 🔁 118 💬 10 📌 18
So that’s what model welfare is for
14.10.2025 22:40 — 👍 0 🔁 0 💬 0 📌 0
'This study will examine the impacts of the unstated paternity policy and the gradual decrease of status Indians in some First Nations bands,' says Battiste.
www.aptnnews.ca?p=278355
Here’s a music video that I made for Danger Mouse & Black Thought.
youtu.be/LmK978jFxMk?...
I am seeing a lot of people reposting Lakota Man today bc it's Indigenous Peoples Day. A reminder that he is not well liked amongst many (most?) Natives on social media. Many of us blocked him a long time ago. Some of the reasons why are in this article.
www.dailydot.com/irl/lakotama...
Large buffalo rubbing stone in the foreground of a wide landscape shot of prairie and blue sky.
Thousands of snow geese on a prairie lake.
Tall metal cut-out sign in front of a wooden rail fence with brown prairie and blue sky behind. Sign marks the head of the Grasslands Nature Trail in Last Mountain Lake National Wildlife Area.
Brown grass stretches to the horizon under a blue sky with wispy clouds. There is a fence on the horizon.
Hard to believe I’m watching snow fall now after a day like yesterday. Beautiful walk full of amazing nature encounters in the oldest migratory bird sanctuary in North America. Saw a few snow geese 😂. I’ll share more after I get through the photos! 🌿 #birds #prairie
12.10.2025 15:56 — 👍 221 🔁 17 💬 12 📌 0
Cat
Here is my plan to make Bluesky more fun and active:
08.10.2025 00:06 — 👍 3 🔁 0 💬 0 📌 0
NEW on our #DeeperLearning blog
People balance being kind vs. being honest — and #LLMs should too.
New research shows training choices often favor informativeness over kindness, but prompting can induce sycophancy.
Read more: bit.ly/3Wqrtxl
8/8
Further context on the AI Ecologies Lab in these recaps: hacnumedia.org/creer-avec-l..., raav.org/actuality/qu... and lienmultimedia.com/spip.php?art... The festival is mutek.org
7/8 The takeaway for the public: training choices like entropy regularization can make systems more robust, meaning fewer restarts and less costly retraining when the world shifts. Your learning systems become more durable and efficient.
07.10.2025 18:33 — 👍 1 🔁 0 💬 1 📌 0
6/8 To make it more visually fun, I teamed up with the Société des arts technologiques sat.qc.ca to create an experience, using open-source Ossia Score to drive particle clouds, audio, and 3D transforms in real time while the agents learned. ossia.io
07.10.2025 18:33 — 👍 1 🔁 0 💬 1 📌 0
5/8
Both agents must unlearn and relocate the reward peak. The entropy-max agent stays a bit uncertain, keeps exploring, so it detects the shift faster and adapts sooner.
4/8
To communicate this to a general audience and the #art community, I built a minimal task: two Gaussian bandits. One agent optimizes with entropy; the other doesn’t. Mid-training, the reward distribution jumps.
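The two-bandit demo can be sketched minimally. This is my own toy reconstruction, not the project's actual code: the arm means, learning rates, entropy weight, and the gradient-bandit formulation are all assumptions.

```python
import numpy as np

def run_bandit(beta, steps=4000, shift_at=2000, seed=0):
    """Gradient-bandit agent on two Gaussian arms whose means swap
    mid-training. beta is the entropy-bonus weight (0 = plain agent).
    Returns the fraction of best-arm pulls after the shift."""
    rng = np.random.default_rng(seed)
    means = np.array([1.0, 0.0])          # arm 0 is best before the shift
    prefs = np.zeros(2)                   # softmax preferences
    baseline, alpha = 0.0, 0.1
    correct = []
    for t in range(steps):
        if t == shift_at:
            means = means[::-1]           # non-stationary jump: arm 1 now best
        p = np.exp(prefs - prefs.max())
        p /= p.sum()
        a = rng.choice(2, p=p)
        r = rng.normal(means[a], 1.0)
        baseline += 0.01 * (r - baseline)
        grad = -p.copy()
        grad[a] += 1.0                    # gradient of log pi(a) w.r.t. prefs
        # exact gradient of the softmax policy's entropy w.r.t. prefs
        logp = np.log(p + 1e-12)
        entropy_grad = p * ((p * logp).sum() - logp)
        prefs += alpha * ((r - baseline) * grad + beta * entropy_grad)
        correct.append(a == int(np.argmax(means)))
    return float(np.mean(correct[shift_at:]))
```

The mechanism: the plain agent's preference gap keeps growing until its gradient effectively vanishes, so after the swap it keeps pulling the stale arm; the entropy bonus caps that gap, so the entropic agent keeps sampling both arms and re-adapts sooner.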
3/8
By training systems this way, agents should handle non-stationary changes better. Yet outside research circles, “AI” is equated with LLMs or generative models; RL remains a largely unknown learning paradigm in the public’s eyes.
Photo by Félix Bonne-Vie
2/8
I proposed a reinforcement-learning (RL) demo: add a maximum-entropy term to increase the longevity of systems in a non-stationary environment. This is well known to the RL research community: openreview.net/forum?id=PtS...
(photo by Félix Bonne-Vie)
1/8
A month ago I wrapped a 4-month project with MUTEK Forum’s AI Ecologies Lab, led by Sarah Mackenzie: the research arm of Montréal’s 25-year-old electronic music festival. Why entropy can make AI more resilient. Event: ra.co/events/2206981
My eye colour apparently changed after 6 years
03.10.2025 00:07 — 👍 1 🔁 0 💬 0 📌 0
We're finally out of stealth: percepta.ai
We're a research / engineering team working together in industries like health and logistics to ship ML tools that drastically improve productivity. If you're interested in ML and RL work that matters, come join us 😀
I am on one transformer paper from 3 years ago and ICLR flooded my bids with RLVR & RLHF :S
01.10.2025 21:11 — 👍 0 🔁 0 💬 0 📌 0
w/ James Cohan, @jacobeisenstein.bsky.social, and Kristina Toutanova
Paper link: arxiv.org/abs/2509.22445
A huge thank you to my collaborators @shahabbakht.bsky.social and Christopher Pack for their guidance on this project. We’d love to hear your thoughts and comments!
The preprint: www.biorxiv.org/content/10.1...
9. We hypothesized that the efficacy of the learning curricula depends on how many distinct, useful visual features the brain recruits to solve the task - curricula which lead learners to rely on fewer, more essential visual features will result in better generalization.
30.09.2025 14:25 — 👍 2 🔁 1 💬 1 📌 0
5. In this study, we leveraged ANNs to develop a mechanistic predictive theory of learning generalization in humans. Specifically, we wanted to understand the role of **learning curriculum**, and develop a theory of how curriculum affects generalization.
30.09.2025 14:25 — 👍 2 🔁 1 💬 1 📌 0
🚨 New preprint alert!
🧠🤖
We propose a theory of how learning curriculum affects generalization through neural population dimensionality. Learning curriculum is a determining factor of neural dimensionality - where you start from determines where you end up.
🧠📈
A 🧵:
tinyurl.com/yr8tawj3
I hope you are doing well. I read your article on the Walrus on the reality of the current state on implementation of recommendations from that TRC. It truly is unfortunate that implementing these recommendations isn't proceeding with alacrity. One area that is confusing for me is the truth around the Kamloops mass grave site. In your article, you state, "discovery of unmarked graves on the grounds of the former Kamloops Indian Residential School". However, follow up work hasn't found any mass graves. I have tried to find primary sources on discovery of actual mass graves without success. Can you please share primary sources on this? I have spoken to others who state that though ground penetrating radar found some suggestions of graves, follow up digging did not find any actual graves. Appreciate any help you can provide. Thank you.
so fucking tiresome to get emails like this whenever I write about residential school history, truly. people who believe that graves don't exist if they can't see the bodies with their own two eyes possess the critical thinking skills of a baby playing peekaboo.
29.09.2025 19:53 — 👍 138 🔁 29 💬 6 📌 1
Hanover’s Oktoberfest honouring hip hop’s best
28.09.2025 12:56 — 👍 1 🔁 0 💬 0 📌 0
An acrylic painting by self-taught Cree artist Allen Sapp (1928-2015) of a woman, in blue and red, kneeling by a pond, washing clothes, with birch trees around.
'A nice day to wash clothes' | Allen Sapp (Cree) 1928-2015
Private Collection