Alex Makelov

@amakelov.bsky.social

Mechanistic interpretability. Creator of https://github.com/amakelov/mandala. Previously at Harvard/MIT: machine learning, theoretical computer science, competition math.

499 Followers  |  94 Following  |  17 Posts  |  Joined: 19.11.2024

Latest posts by amakelov.bsky.social on Bluesky

Post image

Today we launch a new open research community

It is called ARBOR:
arborproject.github.io/

please join us.
bsky.app/profile/ajy...

20.02.2025 22:15 | 👍 15    🔁 5    💬 1    📌 2
Preview: There's more to mathematics than rigour and proofs

cf. Tao's post-rigorous stage terrytao.wordpress.com/career-advic...

31.12.2024 15:21 | 👍 1    🔁 0    💬 0    📌 0

Still waiting for someone to create Uqbar and Tlön using LLMs

28.12.2024 14:08 | 👍 1    🔁 0    💬 1    📌 0

The math benchmarks I want:
1. OopsBench: given a faulty proof with numbered steps, which step contains an unfixable logical flaw?
2. DunnoMath: half the problems are taken from FrontierMath, half are almost certainly unsolvable. Major points off for guessing an answer to an unsolvable problem.
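
To make the DunnoMath scoring rule concrete, here is a minimal sketch of one way it could work; the function name, the abstention convention, and the penalty weight are illustrative assumptions, not part of the proposal:

```python
# Hypothetical scoring for "DunnoMath": a model may answer or abstain ("dunno").
# Correct answers on solvable problems earn credit; guessing an answer on an
# unsolvable problem costs major points. All weights are illustrative.

def dunnomath_score(responses):
    """responses: list of (is_solvable, answered, correct) triples."""
    score = 0.0
    for is_solvable, answered, correct in responses:
        if not answered:
            continue  # abstaining is always safe
        if is_solvable and correct:
            score += 1.0  # credit for solving a solvable problem
        elif not is_solvable:
            score -= 5.0  # heavy penalty for guessing on an unsolvable item
    return score

# One correct answer, one abstention, one guess on an unsolvable problem:
print(dunnomath_score([(True, True, True),
                       (False, False, False),
                       (False, True, False)]))  # -> -4.0
```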

23.12.2024 14:22 | 👍 43    🔁 4    💬 6    📌 0

I also want 1! It's a great way to get around the annoying thing where LLMs use faulty reasoning to arrive at the correct answer (which I've seen happen many times in math problems). Identifying the wrong proof step gives us a potentially harder benchmark that's still easy to evaluate!

23.12.2024 15:13 | 👍 1    🔁 0    💬 0    📌 0

Large language models show fascinating changes in capability as their parameter counts scale, but scaling also vastly increases the resources required for experimentation on model internals. NDIF is currently hosting the largest open-sourced model, Llama 405b, for YOU to run research on!

20.12.2024 19:49 | 👍 5    🔁 2    💬 1    📌 0

Talk is cheap. Show me the CoT

20.12.2024 19:46 | 👍 2    🔁 0    💬 0    📌 0

SaaS (Santa as a Service)

12.12.2024 18:18 | 👍 1    🔁 0    💬 0    📌 0

Despite failing to give a complete proof, I'd count this as a major improvement over other models' attempts. Most importantly, the model engaged directly with the key steps necessary for a full proof. I essentially consider this problem "solved by LLMs" now!

05.12.2024 21:20 | 👍 0    🔁 0    💬 0    📌 0

In reality, you need to pick at least 18,003 instead of 18,000 (lol), and a precise calculation shows that the average number of representations is at least (18003 choose 3) / (3*18003^2) = 1000.000006... You could go up to 18,257 before this fails.
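
These numbers are easy to verify; a quick check in Python, assuming the setup from the thread below (unordered triples from {1, ..., N}, whose sums of squares take at most 3·N² distinct values):

```python
from math import comb, isqrt

N = 18003
# Average number of representations per possible sum of three squares:
print(comb(N, 3) / (3 * N**2))          # 1000.0000062... (just above 1000)

# The same bound falls just short for N = 18000:
print(comb(18000, 3) / (3 * 18000**2))  # 999.8333...

# Largest N for which every sum stays below 10^9, i.e. 3*N^2 < 10^9:
print(isqrt(10**9 // 3))                # 18257
```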

05.12.2024 21:20 | 👍 0    🔁 0    💬 1    📌 0
Post image

Finally, it realizes and tries to fix the off-by-a-factor-of-6 issue. It writes a little essay giving what mathematicians would call a "moral" argument for why everything is OK. Pretty close!

05.12.2024 21:20 | 👍 0    🔁 0    💬 1    📌 0
Post image

Then, it counts these triples. Unfortunately, it counts the number of ordered triples, which overestimates the number of unordered triples (what we care about) by about a factor of 6. Then it proceeds to the key step: lower-bounding the average number of representations:
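
For reference, the standard fix for that overcount (my summary, not taken from the screenshots): an unordered triple of distinct values corresponds to exactly 3! = 6 ordered triples, and triples with repeated values to fewer (3 or 1), so

```latex
\[
N_{\text{ordered}} \le 6\, N_{\text{unordered}}
\quad\Longrightarrow\quad
N_{\text{unordered}} \ge \frac{N_{\text{ordered}}}{6}.
\]
```

In particular, dividing the ordered count by 6 still yields a valid lower bound.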

05.12.2024 21:19 | 👍 0    🔁 0    💬 1    📌 0
Post image

So how does o1 do? Well, still not perfect, but it gets the overall steps correct! It goes for a direct pigeonhole argument. It eventually figures out that if we look at triples of numbers at most 18,000 each, the sum of their squares is always less than 1,000,000,000:
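
Spelled out (my reconstruction of the step being described): there are C(18000, 3) ≈ 9.7·10^11 unordered triples, and each of their sums of squares is one of fewer than 10^9 values, so by pigeonhole some number below 10^9 is represented at least

```latex
\[
\frac{\binom{18000}{3}}{3 \cdot 18000^2} \approx 999.83
\]
```

times; landing just short of 1000 is exactly why the endpoint gets nudged up to 18,003 in the follow-up above.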

05.12.2024 21:18 | 👍 0    🔁 0    💬 1    📌 0
Post image

Similarly, o1-mini (and o1-preview, from what I remember; it's not available in chat anymore) recalls the asymptotic statement and spends more time talking about it, but also proves nothing about the constant.

05.12.2024 21:17 | 👍 0    🔁 0    💬 1    📌 0
Post image

So how do LLMs do on this problem? 4o spits out a bunch of related facts and confidently asserts the (correct) answer without justification. Importantly, it states that the number of representations grows as sqrt(n) asymptotically, which is true, but the constant is decisive.
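
The asymptotic in question, in back-of-envelope form (my sketch via a volume heuristic for ordered triples of positive integers; presumably the kind of average the model was recalling):

```latex
\[
\#\{(a,b,c) \in \mathbb{Z}_{>0}^{3} : a^2 + b^2 + c^2 \le x\}
\approx \frac{1}{8} \cdot \frac{4\pi}{3}\, x^{3/2}
= \frac{\pi}{6}\, x^{3/2},
\]
\[
\text{so the average number of representations of } n \le x \text{ is roughly } \frac{\pi}{6}\sqrt{x}.
\]
```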

05.12.2024 21:16 | 👍 0    🔁 0    💬 1    📌 0

Also, the numbers superficially pattern-match to relationships (10^9 = 1000^3, there are three numbers, etc.) that are irrelevant to the problem. Finally, the numbers are deliberately chosen so that the simple solution works by only a tiny margin that requires a precise calculation.

05.12.2024 21:16 | 👍 0    🔁 0    💬 1    📌 0
Post image

The problem superficially pattern-matches to some heavy-ish tools, like Pythagorean triples or Legendre's three-square theorem; however, the only solution I'm aware of is actually quite simple and uses no "theory".

05.12.2024 21:16 | 👍 0    🔁 0    💬 1    📌 0
Post image

Some fun with o1 from OpenAI: there's a math problem I often give to "reasoning" AIs to try them out. It's basically to prove that there's a number less than 1 billion that you can write in 1000 different ways as a sum of 3 squares (precise statement in the pic).
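
The full claim is far beyond brute force, but a scaled-down version is easy to play with; a sketch (the 10,000 bound and the positive, non-decreasing convention for (a, b, c) are my assumptions for illustration):

```python
from collections import defaultdict
from math import isqrt

# Count representations n = a^2 + b^2 + c^2 with 0 < a <= b <= c, for all
# n up to a small bound (the actual problem asks about n < 10^9 and 1000 ways).
BOUND = 10_000
counts = defaultdict(int)
for a in range(1, isqrt(BOUND) + 1):
    for b in range(a, isqrt(BOUND - a * a) + 1):
        for c in range(b, isqrt(BOUND - a * a - b * b) + 1):
            counts[a * a + b * b + c * c] += 1

n, ways = max(counts.items(), key=lambda kv: kv[1])
print(n, ways)  # the n <= 10,000 with the most representations
```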

05.12.2024 21:15 | 👍 1    🔁 1    💬 1    📌 0

The presence of the cat correlates with mech interp research getting done

25.11.2024 13:00 | 👍 0    🔁 0    💬 0    📌 0
Cat sitting on a chair in front of a parked black car with its rear wheel removed and a hydraulic jack supporting it

yes, this is what mechanistic interpretability research looks like

24.11.2024 19:51 | 👍 23    🔁 2    💬 2    📌 1
