
Vincent Conitzer

@conitzer.bsky.social

AI professor. Director, Foundations of Cooperative AI Lab at Carnegie Mellon. Head of Technical AI Engagement, Institute for Ethics in AI (Oxford). Author, "Moral AI - And How We Get There." https://www.cs.cmu.edu/~conitzer/

1,331 Followers  |  540 Following  |  217 Posts  |  Joined: 05.05.2024

Latest posts by conitzer.bsky.social on Bluesky

Post image

sorry Italy

01.08.2025 19:32 — 👍 0    🔁 0    💬 0    📌 0
Post image

continued

31.07.2025 20:09 — 👍 0    🔁 0    💬 0    📌 0
Post image

Continuing the donut hole theme, this seems to miss something important.

31.07.2025 20:04 — 👍 0    🔁 0    💬 1    📌 0
Post image

Donut holes don't exist as traversable spaces in real life.

30.07.2025 18:50 — 👍 79    🔁 15    💬 6    📌 1

thank you @melaniemitchell.bsky.social!

30.07.2025 06:20 — 👍 0    🔁 0    💬 0    📌 0
Post image

trash can disposal pro tip

29.07.2025 20:54 — 👍 4    🔁 0    💬 1    📌 0
Post image

"how many 1 dollar bills can i stack between two 20 dollar bills" -- I'm not sure AI Overview understands the word "therefore"

28.07.2025 20:52 — 👍 3    🔁 0    💬 0    📌 0
Preview: From the LocalLLaMA community on Reddit

Notable BTW that the first source it gives is a discussion of what LLMs in the past answered for a similar question! www.reddit.com/r/LocalLLaMA...

27.07.2025 20:32 — 👍 2    🔁 0    💬 0    📌 0
Post image

"is it possible to swim in coffee?" -- apparently it is not because coffee is a liquid, not a solid, and you want a solid, swimmable medium.

26.07.2025 19:32 — 👍 1    🔁 0    💬 1    📌 0
Post image

(2/2) continued

25.07.2025 21:01 — 👍 1    🔁 0    💬 0    📌 0
Post image

(1/2) more logic (based on relevant facts)

25.07.2025 21:01 — 👍 1    🔁 0    💬 1    📌 0
Vincent Conitzer | 75 Years of Nash Equilibrium, Oxford
YouTube video by Maison Française d'Oxford

The Nash75 talks are on YouTube! Below is my talk "Game Theory for AI Agents" (link also gives all other talks on the side).
www.youtube.com/watch?v=WO5x...

24.07.2025 10:18 — 👍 5    🔁 1    💬 0    📌 0

It's likely that Google could release quite a bit more information about this by releasing the chain of thought. My chain of thought is harder to release :-) but started on the strategies, realizing A would want to postpone & B would want to spread out; then for what lambda does A fall behind.

23.07.2025 21:24 — 👍 0    🔁 0    💬 1    📌 0

-

23.07.2025 06:11 — 👍 0    🔁 0    💬 0    📌 0

(16/n, n=16) we have lambda > sqrt(2)/2 + epsilon, so that A's next move can be up to (2m+1)lambda - m*sqrt(2) > lambda + 2m*epsilon. For some sufficiently large m, (m*epsilon)^2 > 2m+2 and hence playing this move will win the game for A, proving (b).

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0
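The asymptotic steps in (13/n) through (16/n) all rest on one expansion of A's budget at round 2m+1; spelling it out (same symbols as the thread):

```latex
(2m+1)\lambda - m\sqrt{2} \;=\; \lambda + m\,(2\lambda - \sqrt{2})
\;\begin{cases}
< \lambda - 2m\varepsilon & \text{if } \lambda < \tfrac{\sqrt{2}}{2} - \varepsilon,\\[2pt]
> \lambda + 2m\varepsilon & \text{if } \lambda > \tfrac{\sqrt{2}}{2} + \varepsilon,
\end{cases}
```

so below the threshold the budget eventually goes negative (B wins), and above it the budget grows linearly in m until A has a killing move.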

(15/n) round, by convexity.] This means that A can make a move of up to (2m+1)lambda - m*sqrt(2) >= lambda >= 0, so A won't get in trouble the next move; hence A will never lose, proving the other half of (c). Moreover, if lambda > sqrt(2)/2, then for some epsilon>0,

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(14/n) Suppose lambda >= sqrt(2)/2.
With the wait-for-the-kill strategy, if after 2m rounds the game is still active, then the sum of the numbers is at most m*sqrt(2). [This is because A will have always played 0, and B would have made the sum of the numbers the highest by choosing sqrt(2) every

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(13/n) Moreover, if lambda < sqrt(2)/2, then for some epsilon>0, we have lambda < sqrt(2)/2 - epsilon, so that A's next move can be at most (2m+1)lambda - m*sqrt(2) < lambda - 2m*epsilon, and at some point this will be negative, so B must win eventually, proving (a).

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(12/n) if for every i, the numbers played in rounds 2i+1 and 2i+2 are 0 and sqrt(2), by convexity.] It follows that A's next move can be at most (2m+1)lambda - m*sqrt(2) <= lambda, which means B won't get in trouble the next move; hence B will never lose, proving half of (c).

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(11/n) Proof:
Suppose lambda <= sqrt(2)/2.
With the fill'er-up strategy, if after 2m rounds the game is still active, then the sum of squares so far is 2m and the sum of the numbers is at least m*sqrt(2). [This is because what would make the sum of the numbers the lowest is

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0
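The bracketed convexity claim in (11/n) reduces to a per-pair minimization: under fill'er up, each A-move/B-move pair (a, b) adds exactly 2 to the sum of squares, and the smallest sum of values such a pair can contribute is

```latex
\min \{\, a + b \;:\; a^2 + b^2 = 2,\ a, b \ge 0 \,\} \;=\; \sqrt{2},
```

attained at the endpoints (a, b) = (0, sqrt(2)) or (sqrt(2), 0); summing over the m pairs gives the m*sqrt(2) lower bound on the sum of the numbers.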

(10/n) (c) If lambda = sqrt(2)/2 then B can keep A from winning by playing fill'er up and A can keep B from winning by playing wait for the kill.

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(9/n) For B: "Fill'er up" -- always choose the largest number possible.

Theorem:
(a) If lambda is strictly below sqrt(2)/2 then B will win by playing fill'er up.

(b) If lambda is strictly above sqrt(2)/2 then A will win by playing wait for the kill.

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0
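The theorem can be sanity-checked numerically. Below is a minimal sketch, assuming the rules of the underlying IMO game as they are used throughout the thread (on odd turns A picks a nonnegative number keeping the running sum at most lambda times the turn number; on even turns B picks a nonnegative number keeping the running sum of squares at most the turn number; a player who cannot move loses). It pits A's wait-for-the-kill against B's fill'er-up:

```python
import math

def play(lam, rounds=200):
    """Simulate the game with A playing "wait for the kill" and B playing
    "fill'er up". Returns 'A' or 'B' for the winner, or 'draw' if neither
    player gets stuck within `rounds` turns."""
    s = 0.0   # running sum of the numbers played
    sq = 0.0  # running sum of squares of the numbers played
    for n in range(1, rounds + 1):
        if n % 2 == 1:
            # A's turn: must keep s + x <= lam * n, x >= 0
            budget = lam * n - s
            if budget < 0:
                return 'B'  # A cannot move
            # wait for the kill: play 0 unless spending the whole budget
            # pushes the sum of squares past B's next bound (n + 1)
            kill = budget * budget + sq > n + 1
            x = budget if kill else 0.0
        else:
            # B's turn: must keep sq + x^2 <= n, x >= 0
            room = n - sq
            if room < 0:
                return 'A'  # B cannot move
            x = math.sqrt(room)  # fill'er up: largest possible number
        s += x
        sq += x * x
    return 'draw'
```

With lambda = 0.6 (below sqrt(2)/2) B wins, with lambda = 0.8 (above it) A wins, and at lambda = sqrt(2)/2 exactly, both strategies hold out indefinitely, matching (a), (b), and (c).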

(8/n) Intuition: A wants to "concentrate" the value of the numbers, whereas B wants to spread it out as much as possible. Thus we will consider the following two strategies:
For A: "Wait for the kill" -- always choose 0 unless and until there is a choice that makes the next move for B impossible.

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(7/n) In particular, GDT's proof doesn't *look* like any kind of raw-search-power proof.

My own solution, done before looking at anyone or anything else's, follows. (If you like proofs, and games, but also asymptotic analysis, you might enjoy doing it yourself.)

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(6/n) and I wave my hands a bit more. I added a bit of intuition at the beginning but many human mathematicians wouldn't. While one could probably still tell mine is the human one and GDT's is the computer one in a number of ways, the difference isn't as extreme or meaningful as I expected.

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(5/n) Mine's a bit shorter, but not in any particularly meaningful way; I just wanted to write/repeat less and do less algebra (so I didn't work out the exact round in which something happens; I was also too lazy to check GDT's algebra, but I take it someone did),

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(4/n) We should devote a lot of resources now to preparing for and steering the various consequences, of all kinds.)

Anyone else with a similar analysis of the other problems? Where are OpenAI's solutions?

More about solution comparison:

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(3/n) But if everything is above board it definitely gets me to think (even) higher of the capabilities of these systems. (No worries, I will continue posting funny examples, but the point of those was never that these systems aren't impressive or aren't quickly getting better.

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(2/n) Below is my proof (hope I didn't mess it up :-)) and Gemini Deep Think's (GDT's) is here: storage.googleapis.com/deepmind-med...

I'm impressed. The basic thinking ("thinking"?) behind the proofs seems the same. More about the comparison below.

23.07.2025 06:11 — 👍 0    🔁 0    💬 1    📌 0

(1/n) I noticed that on the IMO where several LLM-based systems just got gold medals, the fifth problem (the last one solved by Gemini Deep Think) was a game problem, so I had no excuse :-) not to try it myself first and then see how the solutions compared.

23.07.2025 06:11 — 👍 4    🔁 0    💬 2    📌 0
