Person standing next to poster titled "When Empowerment Disempowers"
Still catching up on my notes after my first #cogsci2025, but I'm so grateful for all the conversations and new friends and connections! I presented my poster "When Empowerment Disempowers" -- if we didn't get the chance to chat or you would like to chat more, please reach out!
06.08.2025 22:31
Evolving general cooperation with a Bayesian theory of mind | PNAS
Theories of the evolution of cooperation through reciprocity explain how unrelated self-interested individuals can accomplish more together than th...
Our new paper is out in PNAS: "Evolving general cooperation with a Bayesian theory of mind"!
Humans are the ultimate cooperators. We coordinate on a scale and scope no other species (nor AI) can match. What makes this possible? 🧵
www.pnas.org/doi/10.1073/...
22.07.2025 06:03
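The thread doesn't spell out the mechanics, but the core Bayesian theory-of-mind idea in the title can be sketched in a few lines: an agent keeps a posterior over candidate partner types and updates it with Bayes' rule as it watches the partner act. The types, actions, and probabilities below are invented for illustration and are not the paper's model.

```python
# Toy illustration of Bayesian theory-of-mind inference (hypothetical numbers,
# not the paper's model): keep a posterior over partner "types" and update it
# with Bayes' rule after each observed action.

belief = {"cooperator": 0.5, "defector": 0.5}        # prior over partner types

likelihood = {                                        # P(action | type), assumed
    "cooperator": {"share": 0.8, "hoard": 0.2},
    "defector":   {"share": 0.1, "hoard": 0.9},
}

def update(belief, action):
    """Return the posterior over partner types after observing one action."""
    unnorm = {t: belief[t] * likelihood[t][action] for t in belief}
    z = sum(unnorm.values())
    return {t: p / z for t, p in unnorm.items()}

for action in ["share", "share", "hoard"]:
    belief = update(belief, action)
    print(action, {t: round(p, 3) for t, p in belief.items()})
```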
Really pumped for my Oral presentation on this work today!!! Come check out the RL session from 3:30-4:30pm in West Ballroom B
You can also swing by our poster from 4:30-7pm in West Exhibition Hall B2-B3 # W-713
See you all there!
15.07.2025 14:46
I'll be at ICML next week! If anyone wants to chat about single/multi-agent RL, continual learning, cognitive science, or something else, shoot me a message!!!
08.07.2025 13:09
Oral @icmlconf.bsky.social !!! Can't wait to share our work and hear the community's thoughts on it, should be a fun talk!
Can't thank my collaborators enough: @cogscikid.bsky.social @liangyanchenggg @simon-du.bsky.social @maxkw.bsky.social @natashajaques.bsky.social
09.06.2025 16:32
The big takeaway: Environment diversity > Partner diversity
Training across diverse tasks teaches agents how to cooperate, not just whom to cooperate with. This enables zero-shot coordination with novel partners in novel environments, a critical step toward human-compatible AI.
19.04.2025 00:09
GitHub - wcarvalho/nicewebrl: Python library for easily making web Apps to compare humans and AI
Our work used NiceWebRL, a Python-based package we helped develop for evaluating Human, Human-AI, and Human-Human gameplay on Jax-based RL environments!
This tool makes crowdsourcing data for CS and CogSci studies easier than ever!
Learn more: github.com/wcarvalho/ni...
19.04.2025 00:09
Why do humans prefer CEC agents? They collide less and adapt better to human behavior.
This increased adaptability reflects general norms for cooperation learned across many environments, not just memorized strategies.
19.04.2025 00:09
Human studies confirm our findings! CEC agents achieve higher success rates with human partners than population-based methods like FCP, and are rated qualitatively better to collaborate with than the SOTA approach (E3T), despite never having seen the level during training.
19.04.2025 00:08
Using empirical game theory analysis, we show CEC agents emerge as the dominant strategy in a population of different agent types during Ad-hoc Teamplay!
When diverse agents must collaborate, the CEC-trained agents are selected for their adaptability and cooperative skills.
19.04.2025 00:08
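A rough sketch of how empirical game-theoretic analysis of this kind typically works: estimate an average payoff matrix between agent types from simulated matchups, then evolve a population under replicator dynamics and see which type takes over. The payoff numbers below are placeholders for illustration, not the paper's measured results.

```python
import numpy as np

# Sketch of empirical game-theoretic analysis via replicator dynamics.
# payoff[i, j] = average return an agent of type i gets when paired with type j.
# The numbers are illustrative placeholders, not the paper's measured payoffs.
types = ["CEC", "FCP", "E3T"]
payoff = np.array([
    [0.9, 0.7, 0.7],
    [0.6, 0.8, 0.5],
    [0.6, 0.5, 0.8],
])

pop = np.ones(len(types)) / len(types)   # start from a uniform population
for _ in range(500):
    fitness = payoff @ pop               # expected payoff of each type
    avg = pop @ fitness                  # population-average payoff
    pop = pop * fitness / avg            # above-average types grow in share
print(dict(zip(types, pop.round(3))))
```

In this sense a type is "dominant" when the replicator dynamics drive the population toward it from a neutral starting mix.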
The result? CEC agents significantly outperform baselines when collaborating zero-shot with novel partners on novel environments.
Even more impressive: CEC agents outperform methods that were specifically trained on the test environment but struggle to adapt to new partners!
19.04.2025 00:08
We built a Jax-based procedural generator creating billions of solvable Overcooked challenges.
Unlike prior work studying only 5 layouts, we can now study cooperative skill transfer at unprecedented scale (1.16e17 possible environments)!
Code available at: shorturl.at/KxAjW
19.04.2025 00:07
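A toy sketch of the procedural-generation recipe described above, written in plain Python rather than JAX: sample a random kitchen grid, scatter the key objects, and rejection-sample until a reachability check says the layout is solvable. Grid size, wall density, and the object set are assumptions for illustration, not the actual generator.

```python
import random
from collections import deque

# Toy Overcooked-style layout generator in plain Python (the actual generator
# is JAX-based; grid size, wall density, and object set here are assumptions).
WALL, FLOOR = "#", "."

def sample_layout(h=7, w=9, wall_prob=0.25, seed=None):
    """Sample a random grid and drop key objects on random floor tiles."""
    rng = random.Random(seed)
    grid = [[WALL if rng.random() < wall_prob else FLOOR for _ in range(w)]
            for _ in range(h)]
    floors = [(r, c) for r in range(h) for c in range(w) if grid[r][c] == FLOOR]
    pot, onions, delivery, a1, a2 = rng.sample(floors, 5)
    return grid, {"pot": pot, "onions": onions, "delivery": delivery,
                  "agents": [a1, a2]}

def solvable(grid, objects):
    """Crude solvability check: every key tile is reachable from agent 1."""
    h, w = len(grid), len(grid[0])
    seen, frontier = {objects["agents"][0]}, deque([objects["agents"][0]])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and grid[nr][nc] == FLOOR \
                    and (nr, nc) not in seen:
                seen.add((nr, nc))
                frontier.append((nr, nc))
    targets = [objects["pot"], objects["onions"], objects["delivery"],
               objects["agents"][1]]
    return all(t in seen for t in targets)

# Rejection-sample until the layout passes the check.
grid, objs = sample_layout(seed=0)
while not solvable(grid, objs):
    grid, objs = sample_layout()
print("\n".join("".join(row) for row in grid))
print(objs)
```

Rejection sampling against a cheap solvability check is what lets a generator like this emit huge numbers of distinct but playable layouts.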
We introduce Cross-Environment Cooperation (CEC), where agents learn through self-play across procedurally generated environments.
CEC teaches robust task representations rather than memorized strategies, enabling zero-shot coordination with humans and other AIs!
19.04.2025 00:06
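To make the training recipe concrete, here is a schematic of what a cross-environment self-play loop looks like on a toy cooperative task: one shared policy controls both players, and every episode is run in a freshly sampled environment so the policy cannot memorize a single layout. The toy environment and bandit-style update are illustrative assumptions, not the paper's implementation.

```python
import random

# Schematic CEC-style training loop on a toy cooperative task (an illustrative
# stand-in, not the paper's code): one shared policy controls both players, and
# each episode is played in a freshly sampled environment.

class ToyCoopEnv:
    """Stand-in for a procedurally generated cooperative task."""
    def __init__(self, seed):
        self.goal = random.Random(seed).randrange(3)     # env-specific target
    def step(self, a1, a2):
        # Reward only when both agents coordinate on this env's target action.
        return 1.0 if a1 == a2 == self.goal else 0.0

def act(obs, weights):
    """Shared policy: pick the highest-scoring action with a bit of noise."""
    return max(range(3), key=lambda a: weights[obs][a] + random.gauss(0, 0.1))

weights = {obs: [0.0, 0.0, 0.0] for obs in range(3)}
for episode in range(5000):
    env = ToyCoopEnv(seed=random.randrange(10**6))       # new env every episode
    obs = env.goal                                       # goal observed, for simplicity
    a1, a2 = act(obs, weights), act(obs, weights)        # self-play: same policy twice
    r = env.step(a1, a2)
    for a in (a1, a2):                                   # simple bandit-style update
        weights[obs][a] += 0.1 * (r - weights[obs][a])
print({obs: [round(v, 2) for v in w] for obs, w in weights.items()})
```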
Current AI cooperation algos form brittle strategies by focusing on partner diversity in fixed tasks.
I.e. they might learn a specific handshake but fail when greeted with a fist bump.
How can AI learn general norms that work across contexts and partners?
19.04.2025 00:06
Our new paper (first one of my PhD!) on cooperative AI reveals a surprising insight: Environment Diversity > Partner Diversity.
Agents trained in self-play across many environments learn cooperative norms that transfer to humans on novel tasks.
shorturl.at/fqsNN
19.04.2025 00:06
That's true! I think the significance of not assigning the same meaning to symbols may only matter when we're interacting with the agent, but definitely room to explore what we mean by "understand" here and how anyone can learn to grasp the full affordances of objects with underspecified properties!
07.01.2025 05:03
I guess, but aren't RL agents technically "understanding the world" just through the lens of decision making for a given problem (the MDP they're solving is their world)? Or do you mean Nethack requires having good priors about the real world that brute force alone would take too long to get?
07.01.2025 04:35
Can it solve nethack?
06.01.2025 05:29
Really excited to present my work this Sunday @NeurIPS on how we might approach training a generalist agent capable of cooperation at scale: coordinating with many novel partners on many novel tasks has never been easier!
Come by the IMOL workshop to check it out and chat more!
12.12.2024 18:33
IMOL@NeurIPS 2024
Intrinsically Motivated Open-ended Learning NeurIPS 2024 in-person Workshop, December 15, Vancouver. imol.workshop@gmail.com. Description How do humans develop broad and flexible repertoires of knowle...
Also on Sunday, Kunal Jha @kjha02.bsky.social will be presenting his recent work InfiniteKitchen: Cross-environment Cooperation for Zero-shot Multi-agent Coordination at the Intrinsically Motivated Open-ended Learning workshop imol-workshop.github.io
11.12.2024 20:03
thinking of calling this "The Illusion Illusion"
(more examples below)
01.12.2024 14:33
Deep Learning, Bayan Playing
UW NLP (Ark), UIUC
Website: https://andreyrisukhin.github.io/
phd student building computational models of social cognition @ edinburgh | prev imperial, ucl, inria
https://maxtaylordavi.es
Kempner Institute research fellow @Harvard interested in scaling up (deep) reinforcement learning theories of human cognition
prev: deepmind, umich, msr
https://cogscikid.com/
https://ananyahjha93.github.io
First year PhD at @uwcse.bsky.social with @hanna-nlp.bsky.social and @lukezettlemoyer.bsky.social
Seattle, WA
Father, Husband
Robot Librarian @ AMZN
For: Housing, worker power, kid-friendly cities, collective care
Studying multi-agent collaboration
PhD Candidate at Princeton CS with Tom Griffiths & Natalia VΓ©lez @cocoscilab.bsky.social @velezcolab.bsky.social
Prev: Cornell CS, MIT BCS
Science of language models @uwnlp.bsky.social and @ai2.bsky.social with @PangWeiKoh and @nlpnoah.bsky.social. https://ianmagnusson.github.io
Research director @Inria, Head of @flowersInria
lab, prev. @MSFTResearch @SonyCSLParis
Artificial intelligence, cognitive sciences, sciences of curiosity, language, self-organization, autotelic agents, education, AI and society
http://www.pyoudeyer.com
grad student @contextlab.bsky.social @dartmouthpbs.bsky.social
Cancer scientist and oncologist, Professor at Stanford and Director of the Stanford Cancer Institute
Associate Director (Research & Grants) @ Cooperative AI Foundation
Math Assoc. Prof. at Aix-Marseille (France)
Currently on Sabbatical at CRM-CNRS, Université de Montréal
https://sites.google.com/view/sebastien-darses/welcome
Teaching Project (non-profit): https://highcolle.com/
Research director | @McGillU @Mila_Quebec @IVADO_Qc | My team designs machine learning frameworks to understand biological systems from new angles of attack
Postdoc in NeuroAI at Sorbonne University.
Studying collaboration and morality in humans and machines. Computational ethics, Cybernetics, ALife, self-organization, complexity, ecology, cultural evolution.
Computer Vision @ Nvidia | Ex-Qualcomm - Computational Imaging | Tried to do some research @ UCF CRCV lab but failed | A midwit ape existing to accelerate the increase of entropy in the system
Incoming PhD, UC Berkeley
Interested in RL, AI Safety, Cooperative AI, TCS
https://karim-abdel.github.io
AGI safety researcher at Google DeepMind, leading causalincentives.com
Personal website: tomeveritt.se
A LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef
Writes http://interconnects.ai
At Ai2 via HuggingFace, Berkeley, and normal places