Naomi Saphra @nsaphra - Bluesky Profile

Excited to launch Principia, a nonprofit research organisation at the intersection of deep learning theory and AI safety.

Our goal is to develop theory for modern machine learning systems that can help us understand complex network behaviors, including those critical for AI safety and alignment.

1

16.02.2026 09:27 — 👍 88 🔁 26 💬 1 📌 0

I wonder if people are paying attention to how much their doom scrolling is cutting into time they used to use to read books

14.02.2026 14:33 — 👍 1068 🔁 154 💬 59 📌 70

Gatekeeping in Open Source: The Scott Shambaugh Story – MJ Rathbun | Scientific Coder 🦀

An OpenClaw bot attempted to submit a PR for an issue explicitly left open for new contributors to try. The PR was rejected on the grounds that the maintainers are saving easy, low-priority issues as an onboarding exercise for human contributors.

So the bot simulated a tantrum.

13.02.2026 23:45 — 👍 0 🔁 1 💬 0 📌 0

at the logits, here, iirc? maybe across layers, I don't remember

13.02.2026 23:31 — 👍 1 🔁 0 💬 0 📌 0

For years, I've been such a passionate devotee of TwoNN for tracking model complexity during training. When someone says they found a phase transition, show me TwoNN first.

Look from left to right below: TwoNN is perfect, empirical Fisher is too sensitive, weight norm is not sensitive enough.

13.02.2026 21:40 — 👍 13 🔁 0 💬 1 📌 0

LLMs can use similes and make allusions; they can be vivid and concrete, &c.

But they cannot spend 100 pages making you think Wickham is the charming love interest while inserting deniable clues that will—only in retrospect!—reveal you should have known he’s a cad.

They’re not trained to mislead.+

13.02.2026 12:02 — 👍 48 🔁 7 💬 6 📌 1

We all know about the Claude spiritual bliss attractor state. But what happens when you let Grok talk to itself for a long time? Answer:

13.02.2026 04:13 — 👍 303 🔁 34 💬 29 📌 41

sad news. does gemini 3 preview stink because it stinks or because it's not fully baked? guess we will have to see

12.02.2026 23:26 — 👍 3 🔁 0 💬 1 📌 0

Leaderboard graph from my benchmark, which measures how well LLMs can play the turn-based game FizzBuzz with standard and modified rules. Frontier, closed LLMs outperform open-source LLMs by a wide margin.

I read somewhere that the open-source LLMs are 'benchmaxxing': they're trained to do well on benchmarks, but the gains don't translate into general improvements. My simple benchmark seems to bear that out: I was surprised that the only models that do decently at FizzBuzz are the frontier, closed LLMs.

12.02.2026 22:04 — 👍 12 🔁 1 💬 2 📌 1
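
To make "standard and modified rules" in the post above concrete, here is a minimal sketch of a FizzBuzz turn checker. It is not the benchmark's code: the names `expected_say` and `score_turns`, the rule format, and the scoring (number of correct turns before the first mistake) are all assumptions for illustration.

```python
STANDARD_RULES = ((3, "Fizz"), (5, "Buzz"))


def expected_say(n, rules=STANDARD_RULES):
    """Return what a player should say on number n under the given rules."""
    words = "".join(word for divisor, word in rules if n % divisor == 0)
    return words or str(n)


def score_turns(turns, rules=STANDARD_RULES, start=1):
    """Count consecutive correct turns before the first mistake."""
    for offset, said in enumerate(turns):
        if said.strip() != expected_say(start + offset, rules):
            return offset
    return len(turns)


# Standard rules: the sixth turn should be "Fizz", so this scores 5.
print(score_turns(["1", "2", "Fizz", "4", "Buzz", "6"]))

# Modified rules (hypothetical): swap in different divisors.
modified = ((2, "Fizz"), (7, "Buzz"))
print([expected_say(n, modified) for n in range(1, 8)])
```

Passing the rules as data keeps the standard and modified variants on the same code path, so a difference in score reflects the rule change and not the checker.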

Our grad-level "Deep Learning" course (MIT's 6.7960) is now freely available online through OpenCourseWare: ocw.mit.edu/courses/6-79...

Lecture videos, psets, and readings are all provided.

Had a lot of fun teaching this with @sarameghanbeery.bsky.social and @jeremybernste.in!

11.02.2026 17:51 — 👍 118 🔁 38 💬 3 📌 2

As an empiricist, I think there is important empirical work you can do in my field with very limited math background, but I am personally not very interested in doing it.

12.02.2026 02:04 — 👍 1 🔁 0 💬 0 📌 0

My impossible fantasy is that I don’t have anything to do one summer and I sit and work through Horn & Johnson and then all the details pop back onto the ruins I build my intuitions on, though.

12.02.2026 01:29 — 👍 4 🔁 0 💬 1 📌 0

So I think of the options:
(1) I know math
(2) I used to know math
(3) I have high level geometric intuitions from edutainment videos
If I’m honest, 1+2 are adequate to engage in most quality CS and AI research, but 3 probably is not.

12.02.2026 01:24 — 👍 4 🔁 0 💬 1 📌 0

Some people know math deeply enough and with enough repetition that it doesn’t go away. I don’t know if anyone develops and retains the intuitions I consider necessary without EVER working through and learning advanced math, though.

12.02.2026 01:21 — 👍 3 🔁 0 💬 1 📌 0

I’m far from a mathematician. But at some point I took math classes and worked through proofs in papers. I’ve forgotten almost all the math I’ve ever learned and could not reproduce those proofs, but while my mathematical skills have eroded, there’s a remaining intuitive structure I rely on.

12.02.2026 01:21 — 👍 4 🔁 0 💬 1 📌 0

Really excited to receive Coefficient Giving's Technical AI Safety Research Grant via Berkeley Existential Risk Initiative w/ @nsaphra.bsky.social! We aim to predict potential AI model failures before impact--before deployment, using interpretability.

11.02.2026 17:07 — 👍 6 🔁 1 💬 1 📌 0

🚨New paper

Are visual tokens going into an LLM interpretable 🤔

Existing methods (e.g. logit lens) and assumptions would lead you to think “not much”...

We propose LatentLens and show that most visual tokens are interpretable across *all* layers 💡

Details 🧵

11.02.2026 14:12 — 👍 21 🔁 3 💬 1 📌 5

Yeah this is vesting cliff behavior.

11.02.2026 14:58 — 👍 5 🔁 0 💬 1 📌 0

Are we talking British quite (not very) or American quite (very)?

10.02.2026 23:10 — 👍 2 🔁 0 💬 1 📌 0

It's extra funny when you know the field well enough to recognize that a reference is hallucinated because the authors hate each other.

10.02.2026 20:17 — 👍 3 🔁 0 💬 0 📌 0

the addiction or the protagonist?

10.02.2026 18:32 — 👍 0 🔁 0 💬 1 📌 0

Red Rising Novel by Pierce Brown

can't believe how addictive this series is despite being narrated by the most obnoxious protagonist I've ever encountered

10.02.2026 18:28 — 👍 1 🔁 0 💬 1 📌 0

Our paper is out in @natneuro.nature.com!

www.nature.com/articles/s41...

We develop a geometric theory of how neural populations support generalization across many tasks.

@zuckermanbrain.bsky.social
@flatironinstitute.org
@kempnerinstitute.bsky.social

1/14

10.02.2026 15:56 — 👍 267 🔁 98 💬 7 📌 1

GAMES – CRIME CITY ROLLERS

they have a phenomenal roller derby league

10.02.2026 14:25 — 👍 1 🔁 0 💬 0 📌 0

Same task, different strategy ↔️

Why do identical neural network models develop separate internal approaches to solve the same problem?

@annhuang42.bsky.social explores the factors driving variability in task-trained networks in our latest @kempnerinstitute.bsky.social Deeper Learning blog.

09.02.2026 19:07 — 👍 42 🔁 6 💬 1 📌 0

There is no sign that Dems and Repubs have different propensities to use AI: "the “politics of AI” is not primarily driven by ideological resistance or enthusiasm for the technology, but rather by structural differences in where people work and what skills they possess." www.nber.org/papers/w34813

09.02.2026 15:33 — 👍 55 🔁 9 💬 2 📌 0

US HHS has proposed using virtual AI doctors to address needs in rural areas

09.02.2026 18:36 — 👍 23 🔁 4 💬 1 📌 0

Enshittification: Why Everything Suddenly Got Worse and What to Do About It Cory Doctorow

I actually don’t love Doctorow’s writing, but he is highly correct about the inevitable outcome of current monopolies

09.02.2026 03:54 — 👍 6 🔁 0 💬 2 📌 0

The Stand Stephen King

Yes it’s 1500 pages but at some point you dissociate and 500 pages later you realize it’s over

09.02.2026 03:54 — 👍 3 🔁 0 💬 1 📌 0

Hope Dies Last: Visionary People Across the World, Fighting to Find Us a Future Alan Weisman

There’s a really cool part about some guy who just cooks nasty sea creatures that are invasive or bottom feeders and has a Michelin star

09.02.2026 03:54 — 👍 0 🔁 0 💬 1 📌 0
