4/ The one difference - all of the checks now have an extra zero (or two)
“Even young PhDs could pull a half million dollars a year”
Ilya was offered “nearly $2 million for the first year” in 2016
Some things do change!
3/ Another thing that remains unchanged after a decade and a half - Mark Zuckerberg is extremely uninterested in AI safety, which is why DeepMind refused to sell out to Facebook
2/ It is uncanny how much Meta’s efforts in the last year have just been a rerun of Mark’s original effort to build FAIR by hiring Yann LeCun back in 2013
- Invitations to 1-on-1 dinners with Mark
- A desk next to Mark at the office
- “Intelligent agents” booking airline tickets
- “Open source”
1/4 If I had a nickel for every time Mark Zuckerberg blew a few billion dollars trying to hire a team of star researchers to build a second place frontier AI research lab, I’d have two nickels. Which isn’t a lot, but it’s weird that it happened twice
12/ In that vein, Genius Makers is the apotheosis of the "stamp collecting" era of ML research
Let's hope we're finally entering the physics era
11/ A friend recently told me a hilarious quote from Ernest Rutherford: "All science is either physics or stamp collecting"
If you can't derive some underlying principle of reality from your research, then ultimately you're just collecting a bunch of random facts to no end
10/ Page 72 literally mentions Andrew Ng showing a literal line graph to Larry Page in 2010 charting how multimodal ML performance would consistently improve with the application of more training data
But the book never *quite* brings it all together
9/ Page 70 mentions an eyebrow-raising anecdote about how one of Geoff Hinton's grad students discovered at Google in 2011 that training a speech model with 2000 hours of data instead of 12 hours would miraculously improve error rates
8/ The result is that the most uncanny part of reading Genius Makers is when you see the ghost of scaling laws looming on the edges
On Page 50, the book mentions Terry Sejnowski's 1987 NETtalk paper, which arguably plotted out the world's first log-linear AI scaling law
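For anyone who hasn't stared at one of these charts: a "log-linear" scaling law just means error falls as a power law in data (or compute), so it plots as a straight line on log axes. A minimal sketch below, with made-up coefficients rather than anything from the NETtalk paper:

```python
# Illustrative only: what a "log-linear" scaling law looks like.
# The constant and exponent are made up, not NETtalk's actual numbers.
def error_rate(train_examples, a=2.0, b=0.3):
    # Power law: error ~ a * N^(-b). On log-log axes this is a straight line.
    return a * train_examples ** (-b)

for n in [1e3, 1e4, 1e5, 1e6]:
    print(f"{int(n):>9,} examples -> error {error_rate(n):.3f}")
# Each 10x increase in data cuts error by the same constant factor (10^-b),
# which is why the curve shows up as a straight line on a log-log chart.
```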
7/ But back in the late 2010s, I think it's pretty clear that Cade Metz was still documenting a pre-paradigmatic science, with no consensus on the 'grand unifying theory' that would drive ML research forward
6/ Today, I think it's probably fair to say frontier AI research is on the cusp of becoming a "normal science" for the first time, as the field has coalesced around the scaling laws paradigm that ML performance can improve predictably with the application of more compute
5/ "Normal" science = developed fields where researchers try to solve puzzles within an established paradigm
"Pre-paradigmatic" science = emerging fields with competing schools of thought, no consensus, and disagreement on what problems are even worth studying in the first place
4/ But as much as I like to make fun of NYT journalists, I think there's a more fundamental explanation for where the book went wrong - that goes back to the philosopher Thomas Kuhn's old distinction between "pre-paradigmatic" science and "normal" science
3/ The result? The book spends hundreds of pages talking about: AlphaGo, GANs, LSTMs, RL agents playing DOTA...
Zero mentions of: GPT-2
In fact, there's only a *single* mention of the word "transformer" in the entire 300-page body of the book (a one-off reference to BERT)
2/ The simple explanation is that Genius Makers is a history book that ends right before all the crazy stuff happens
The book was published in March 2021, meaning the final draft was probably finished in the summer of 2020. GPT-3 came out in May 2020
1/ Last week I finally got around to reading Genius Makers - this was Cade Metz's 2021 book on the history of ML
It was really fascinating, but in the same way you'd be fascinated reading a history of Newtonian physics published 3 months before Einstein invented relativity
Basically success for a startup is finding the intersection in the Venn diagram between “sounds crazy to any rational person” and “actually works.” The problem is most of the time you’re just, yknow, actually crazy
Something that I didn’t really appreciate until I left Big Tech is that for a startup to succeed, you really have to be doing stuff that sounds crazy all the time
If it didn’t sound crazy to an L8 at Google, then Google would’ve done it already!
From 1840 to 1850, private Britons cumulatively invested 40% of British GDP into the country’s first rail network. For reference, the equivalent today would be the tech industry spending like, $10 trillion on a single thing (rough math below)
Anyways it’s confirmed, guess we’re all doing this again guys
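Quick sanity check on that comparison, assuming US nominal GDP of roughly $29 trillion today (my assumption, not a figure from the post):

```python
# Rough sanity check on the railway comparison.
# ASSUMPTION: US nominal GDP ~= $29 trillion (mid-2020s); the post doesn't cite a figure.
us_gdp_trillions = 29.0
railway_share_of_gdp = 0.40  # 40% of GDP, per the 1840-1850 British rail figure

equivalent_spend = us_gdp_trillions * railway_share_of_gdp
print(f"~${equivalent_spend:.1f} trillion")  # ~$11.6 trillion, i.e. "like, $10 trillion"
```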
I don't think Americans realize that outside the US, you can now just buy Ozempic online for $150/month. This will ultimately fall to <$50/month
This actually might've ended up as the most important thing in global society in the 2020s, if it weren't for the whole, yknow, AI thing
In 1978, AT&T launched the US's first modern cell service in Chicago. The nationwide launch was scheduled for the early 80s, but never happened because AT&T was broken up for antitrust violations in 1982
Predicting the future is easy. Making money is hard
There's a famous anecdote about the invention of the cellphone: in 1981 McKinsey estimated it'd have a TAM of <1M people, so AT&T exited the market
Turns out this anecdote is made up. AT&T's marketing team did claim this, but the engineers just ignored them and launched anyways
Like, every single part of this sentence is wrong?? Inference on a 500M parameter model requires 1 billion flops, which is not 1000 flops, which is also not 1 tflop (that's a trillion flops)
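The back-of-envelope math here uses the standard rule of thumb that a transformer forward pass costs roughly 2 FLOPs per parameter per token - that rule is my assumption, not something quoted from the UBS note:

```python
# Back-of-envelope FLOPs math behind the correction above.
# ASSUMPTION: a transformer forward pass costs ~2 FLOPs per parameter per token.
params = 500e6               # 500M-parameter model
flops_per_token = 2 * params

print(f"{flops_per_token:.0e} FLOPs per token")   # ~1e9, i.e. ~1 billion FLOPs
print(f"= {flops_per_token / 1e12:.3f} TFLOPs")   # 0.001 TFLOPs -- a TFLOP is 1e12 FLOPs
```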
LLMs are actually fairly good at explaining how they work these days... try asking them!
I sometimes wonder these days what % of equity research is just written by ChatGPT. But then I see UBS publish a paragraph like this and realize I'm still getting 100% authentic human content
It's quite striking that despite everything that's happened in AI over the last 3 years, the world is still spending *less* capex building semiconductor foundries today than in 2022
All of AI is still small enough to be washed away by consumers buying 10% fewer Android phones
I will say that anecdotally when I was at Google, the handful of folks I met at Waymo were not particularly scaling-inclined. Hence the urgency
7/ I’ve never been that impressed by Tesla FSD compared to Waymo. But if Waymo’s own paper is right, then we could be on the cusp of a “GPT-3 moment” in AV where the tables suddenly turn overnight
The best time for Waymo to act was 5 years ago. The next best time is today!
6/ In contrast to Waymo, it’s clear Tesla has now internalized the bitter lesson
They threw out their legacy AV software stack a few years ago, built a 10x larger training GPU cluster than Waymo, and have 1000x more cars on the road collecting training data today
5/ If the same thing is true in AV, this basically obviates the lead that Waymo has been building in the industry since the 2010s. All a competitor needs to do is buy 10x more GPUs and collect 10x more data, and you can leapfrog a decade of accumulated manual engineering effort
4/ The bitter lesson in LLMs post-2019 was that finetuning tiny models on bespoke edge cases was a waste of time. GPT-3 proved that if you just train a 100x bigger model on 100x more data with 10,000x more compute, all the problems would more or less solve themselves!
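The 10,000x figure falls out of the usual training-compute approximation C ≈ 6·N·D (parameters times tokens); the baseline numbers below are illustrative, not from the post:

```python
# Why 100x model x 100x data works out to ~10,000x compute.
# Standard approximation: training compute C ~= 6 * N * D
# (N = parameters, D = training tokens). The 6 is the usual rule of thumb.
def training_flops(params, tokens):
    return 6 * params * tokens

baseline = training_flops(1.5e9, 40e9)                  # GPT-2-ish scale, illustrative numbers
scaled   = training_flops(1.5e9 * 100, 40e9 * 100)      # 100x bigger model, 100x more data

print(f"compute multiplier: {scaled / baseline:,.0f}x")  # 10,000x
```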