Pretty nice looking thesis, thanks!
01.03.2026 22:23
@sharky6000.bsky.social
Research Scientist at Google DeepMind, interested in multiagent reinforcement learning, game theory, games, and search/planning. Lover of Linux 🐧, coffee ☕, and retro gaming. Big fan of open-source. #gohabsgo 🇨🇦 For more info: https://linktr.ee/sharky6000
If they're basically solving a slightly perturbed game, that would be great news.. because then I believe it would be easy to have an "active" version (in the sense of bsky.app/profile/shar...) based on adversarial bandits. Will have to dig into the detail and ask Serena about it.
01.03.2026 17:50
Here's the overview. I highlighted one aspect of this that I really like, because vanilla VasE does not do anything special to handle the statistical uncertainty that is present in the scores out-of-the-box, which could be quite relevant when comparing agents.
01.03.2026 17:47
Except it does it in a more sophisticated way with targeted ambiguity sets *and* it maintains some properties similar to the classical maximal lotteries.
01.03.2026 17:40
If I am right about that, it is similar in spirit to the motivation behind "Projected Replicator Dynamics" in the PSRO paper which simulated a constrained equation that had a lower bound on the probabilities. Or how in Nash averaging they use maximum entropy Nash equilibrium.
01.03.2026 17:40
Yes we use LPs to solve the maximal lotteries objectives too (they are basically two-player zero-sum games). Problem is that makes them sensitive to small changes. My first take was this seems like a way to redesign the LP to spread weight elsewhere...? To avoid the sensitivities.
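A minimal sketch of the sensitivity mentioned above, under stated assumptions: three candidates, a skew-symmetric margin matrix with no ties, and a closed-form solution used in place of an LP solver to keep the example dependency-free. The function name `maximal_lottery_3` and both example matrices are hypothetical and are not taken from VasE or the paper under discussion.

```python
from fractions import Fraction


def maximal_lottery_3(M):
    """Maximal lottery for a 3x3 skew-symmetric margin matrix without ties.

    M[i][j] is the number of voters preferring i to j minus the number
    preferring j to i. The lottery is an equilibrium strategy of the
    symmetric zero-sum game whose payoff matrix is M.
    """
    n = 3
    for i in range(n):
        for j in range(n):
            if M[i][j] != -M[j][i]:
                raise ValueError("margin matrix must be skew-symmetric")
            if i != j and M[i][j] == 0:
                raise ValueError("this sketch assumes no pairwise ties")
    # A Condorcet winner beats everyone, so playing it is a pure equilibrium.
    for i in range(n):
        if all(M[i][j] > 0 for j in range(n) if j != i):
            return [Fraction(int(k == i)) for k in range(n)]
    # Otherwise the majority relation is a 3-cycle. The kernel vector of M
    # earns payoff 0 against every candidate and has full support.
    w = [M[1][2], M[2][0], M[0][1]]
    total = sum(w)
    return [Fraction(x, total) for x in w]


# A beats B by 10, B beats C by 1, A beats C by 1: A is the Condorcet winner.
before = [[0, 10, 1], [-10, 0, 1], [-1, -1, 0]]
# One voter swaps A and C, so the A-vs-C margin moves from +1 to -1.
after = [[0, 10, -1], [-10, 0, 1], [1, -1, 0]]

print(maximal_lottery_3(before))
print(maximal_lottery_3(after))
```

In this toy case a single changed ballot moves the lottery from all weight on A to most weight on C, which is the kind of jump the post describes. An LP-based solver for larger games would be expected to show the same behavior near ties.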
01.03.2026 17:40
Yeah we have been in touch with Serena Wang so we know of this work but I have only skimmed it so far. Looks neat!
01.03.2026 17:24
Spoke to the authors this week! Nice work. They presented in London.
27.02.2026 22:10
This is amazing.
www.getyourfuckingmoneyback.com
So indeed you can go a long way with just proper modeling. What I am curious about is whether predictive modeling like this will generalize outside RRPS. I expect that it will. And indeed maybe it already covers much of the gain we expect from search/reasoning.
27.02.2026 21:56
Yeah, that is a great question! In our RRPS paper from '23, we ran RL in a self-play setting where the "predictive agent" was endowed with the ability to predict which bot it was playing against. And when we tested it against held-out, unknown bots it did much better than standard self-play bots.
27.02.2026 21:56
Ah yes that is a good point. The classical one is zero-sum, but a different one, which is not zero-sum (with the same equilibrium), was used when soliciting the human data.
27.02.2026 21:46
Hey what leaderboard / site is this from?
25.02.2026 23:57
Omg, I can def relate to this...
25.02.2026 23:42
Grateful to @venturebeat.com for featuring our Paradigms of Intelligence team's research on "societies of thought," or internal multi-agent dialogues.
Read the full piece, which includes a thoughtful quote from my friend & colleague James Evans: bit.ly/3ZN4oa5
What???
A Cirque show.. named Ludo.. at a conference banquet dinner!!
🤯🤯🤯
So cool! Can't wait for this! 🥰
Getting a lot easier to avoid tbh because the pro-AI contingent is greater in numbers than before.
But the last few times it happened, it was triggered exactly by this kind of question... but it was posed quite rudely, so maybe you have not (yet) triggered them...?
22.02.2026 15:54
😱
22.02.2026 15:54
💯!!
22.02.2026 15:51
I know, insane!! Did you see that stick stop?? 🤯🤯🤯
youtube.com/shorts/UczOl...
Omg.. overtime.
And the shots are now over 40!! 🤩
Still tied... with 8 minutes to go!
Shots are 37-17!! 🤯
Canada 🇨🇦 / USA 🇺🇸 gold medal hockey game tied 1-1 heading into the third!!
youtu.be/IlrAWd-yYD0?...
462K current active players right now in Steal a Brainrot: en.wikipedia.org/wiki/Steal_a...
That's 0.005% of the world's population. 🤯
I think it's still the most popular game in the world. It set a world record of 20M concurrent users in August (still the standing record IIRC)
Haha, nope, at least not yet in my son's friend groups
There are still timed special events in Roblox every Saturday
Aha, global knowledge spreading of hockey by Bluesky... I approve! 🫶 Thanks for the clarification.
Speaking of which... gold medal Olympics between 🇨🇦 and 🇺🇸 today, starting momentarily!
🤩
Especially if you have kids around the ages of 7-10 you may know about Italian Brainrot (en.wikipedia.org/wiki/Italian...).
Well, today I discovered my new favorite character: Pingu Amore, a cupid penguin with cool shades! 🏹❤️🐧
Or are the Habs really global?
22.02.2026 12:42
Haha, noted
Somehow I had gotten the impression that you were Australian.. is that not right?
(Surprised that you know about the Habs.)
Are you an expat teaching in Australia?