We Made Top AI Models Compete in a Game of Diplomacy. Here’s Who Won.
The models that did the best learned to lie, deceive, and betray their fellow players
When Claude, Gemini, and o3 battle each other for world domination in Diplomacy:
DeepSeek turned warmongering tyrant. Claude wouldn't lie—everyone exploited it ruthlessly. o3 orchestrated a secret coalition, backstabbed every ally, and won.
Thanks to textArena.ai and friends
every.to/p/diplomacy
🧠🤖
#AI #LLM
09.06.2025 19:06
Just published in JOSS: 'GCol: A High-Performance Python Library for Graph Colouring' https://doi.org/10.21105/joss.07871
02.04.2025 11:56
RL researcher at DeepMind
https://schaul.site44.com/ 🇱🇺
I work at Sakana AI 🐟🐠🐡 → @sakanaai.bsky.social
https://sakana.ai/careers
Researcher @MSFTResearch; Prof @UWMadison (on leave); learning in context; thinking about reasoning; babas of Inez Lily.
https://papail.io
LOGML (London Geometry and Machine Learning) aims to bring together mathematicians and computer scientists to collaborate on a variety of problems at the intersection of geometry and machine learning.
🥇 LLMs together (co-created model merging, BabyLM, textArena.ai)
🥈 Spreading science over hype in #ML & #NLP
Proud shareLM💬 Donor
@IBMResearch & @MIT_CSAIL
Anti-cynic. Towards a weirder future. Reinforcement Learning, Autonomous Vehicles, transportation systems, the works. Asst. Prof at NYU
https://emerge-lab.github.io
https://www.admonymous.co/eugenevinitsky
Group Leader, Generative AI | NeurIPS 2024 Program Chair | Principal Scientist & Director | Founder of Amsterdam AI Solutions
Principal Scientist at Naver Labs Europe, Lead of Spatial AI team. AI for Robotics, Computer Vision, Machine Learning. Austrian in France. https://chriswolfvision.github.io/www/
ML Professor at École Polytechnique. Python open source developer. Co-creator/maintainer of POT, SKADA. https://remi.flamary.com/
Cheminformatics, ML, Drug Discovery
The Journal of Open Source Software is a developer friendly, diamond open access journal for research software packages.
Committed to publishing quality research software with zero article processing charges or subscription fees.
https://joss.theoj.org/
ML/AI researcher & former stats professor turned LLM research engineer. Author of "Build a Large Language Model From Scratch" (https://amzn.to/4fqvn0D). Blogging about AI research at magazine.sebastianraschka.com.
International Conference on Learning Representations https://iclr.cc/
AI and Neuroscience, Assistant Professor at CSHL
Assistant Professor at Ecole Polytechnique, IP_Paris// Before: Oxford_VGG, Inria Grenoble // multimodality, genAI enthusiast // happy mum+dog_mum // opinions: mine
Professor of Computer Vision/Machine Learning at Imagine/LIGM, École nationale des Ponts et Chaussées @ecoledesponts.bsky.social Music & overall happiness 🌳🪻 Born well below 350ppm
📍Paris 🔗 https://davidpicard.github.io/
Mathematician/Computer Scientist interested in discrete and computational geometry and topology. Working at University of Basel and ETH Zürich.
https://people.inf.ethz.ch/schnpatr/
The Institut Polytechnique de Paris is a world-class Institute of science and technology.
https://www.ip-paris.fr/en
Assistant Prof of CS at the University of Waterloo, Faculty and Canada CIFAR AI Chair at the Vector Institute. Joining NYU Courant in September 2026. Co-EiC of TMLR. My group is The Salon. Privacy, robustness, machine learning.
http://www.gautamkamath.com