Mike Hearn @mikehearn - Bluesky Profile

On the Biology of a Large Language Model We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology.

Somehow every BlueSky poster knows how LLMs work, meanwhile Anthropic researchers are releasing 35k-word papers meticulously analyzing the internals and still concluding that they don't really know how they work.

transformer-circuits.pub/2025/attribu...

08.08.2025 14:34 — 👍 82 🔁 7 💬 5 📌 3

Tricking LLMs with the "counting letters" prompt is like showing humans an optical illusion and then, when the human perceives it incorrectly, using it as evidence that humans aren't intelligent. It targets a specific blind spot in how we operate but isn't really representative of anything else.

08.08.2025 12:42 — 👍 4 🔁 0 💬 0 📌 1

This is a failure of the new GPT-5 router more than anything. We know reasoning models can correctly answer this question 100% of the time, the router just isn't sophisticated enough to understand that this question, while superficially simple, actually requires reasoning. It's a fixable problem.

08.08.2025 12:36 — 👍 0 🔁 0 💬 0 📌 0

Ok, apparently they considered that and decided (correctly) that wasting effort on this narrow and manufactured problem wasn’t worth it. Gotta just accept the online dunks from people that know enough to trick the LLM but not enough to understand why the trick works. bsky.app/profile/schm...

08.08.2025 12:16 — 👍 2 🔁 0 💬 0 📌 0

OpenAI should just automatically enable thinking for this dumb question that only exists to trick LLMs.

08.08.2025 12:12 — 👍 1 🔁 0 💬 1 📌 0

I like ChatGPT.

25.07.2025 15:19 — 👍 1 🔁 0 💬 1 📌 0

I love that this hypothetical guy immediately made a terrible financial decision on his rent payments. I agree with you, this guy's gonna have a hard time.

01.07.2025 16:51 — 👍 2 🔁 0 💬 1 📌 0

I dislike Scott Adams as a person, his opinions, etc. but I did watch his announcement (the first thing of his I've ever seen) and he was pretty clear that he tried it in the course of leaving no stone unturned. He said he & his dr didn't think it would work, but there were no downsides, so why not.

20.05.2025 01:39 — 👍 2 🔁 0 💬 1 📌 0

This is awesome.

18.05.2025 21:10 — 👍 8 🔁 0 💬 0 📌 0

By "inside" I mean the billions (trillions?) of parameters, activations, attention patterns etc. that are poked and prodded in interpretability studies. No one fully understands how those things work together to produce the model outputs. transformer-circuits.pub/2025/attribu...

14.05.2025 02:40 — 👍 4 🔁 0 💬 1 📌 0

If you have a perfect understanding of how LLMs work, you should contact the authors of this paper, tell them you have the answers, and collect your millions from the AI lab of your choosing. transformer-circuits.pub/2025/attribu...

14.05.2025 01:59 — 👍 0 🔁 0 💬 1 📌 0

I feel like we don't have a perfect understanding of what happens inside LLMs and we also don't have a perfect definition of what thinking means, so I guess I am less confident about this than you are.

14.05.2025 00:34 — 👍 3 🔁 0 💬 6 📌 0

I get the argument that this ruling is potentially beneficial, but I think the idea of every app now having an Apple price and a non-Apple price, with different payment flows for each, is ultimately going to end up as net-negative for everyone (users, devs, Apple).

01.05.2025 02:00 — 👍 3 🔁 0 💬 0 📌 0

This is like if you had a human assistant named Steve, and you said, "Steve, can you write an email in my voice," and then Steve did it, and you got furious at Steve for impersonating you.

23.04.2025 05:50 — 👍 1 🔁 0 💬 0 📌 0

Does it still count as impersonation when she asked ChatGPT to impersonate her?

23.04.2025 05:25 — 👍 0 🔁 0 💬 0 📌 0

A lot of angry and upset people in this thread, but almost no one seems to understand the specifics of what they're angry about. This reporter asked ChatGPT to write about something in her own voice, and it did (privately, just to her). WaPo has absolutely nothing to do with this.

23.04.2025 05:15 — 👍 10 🔁 1 💬 0 📌 1

I just want to note that you can give ChatGPT a prompt with literally any name -- real names, fake names, silly names, serious names -- and it will do the exact same thing. Here's an excerpt in the style of extremely not-real WaPo reporter Barnabas Flimflamington. The outrage over this is silly.

23.04.2025 05:02 — 👍 10 🔁 0 💬 1 📌 0

I asked ChatGPT to write a WaPo story in the style of Mike Hearn, and it did, with my name as the byline. I have never written for WaPo (or anywhere). This is what ChatGPT does, because it's essentially what I asked it to do. This whole thread and the various reactions are wild and kind of insane.

23.04.2025 04:48 — 👍 3 🔁 0 💬 1 📌 0

I feel like people are misunderstanding what this is. Sora.com already has a homepage feed with "likes"; once they add following and comments, it's a social app.

16.04.2025 13:02 — 👍 1 🔁 0 💬 0 📌 0

Here are other screenshots that are closer in tone to today's. It's a thing that he does. bsky.app/profile/adis...

15.04.2025 03:16 — 👍 1 🔁 0 💬 2 📌 0

It's awkward to find his third-person tweets because searching "Yglesias" brings up, you know, all his tweets. But if you search "yglesias third person" you can get a litany of people dragging him for using the third person.

15.04.2025 02:07 — 👍 16 🔁 0 💬 2 📌 2

The timeline of replies to the 3rd-person post is fascinating. It was made 24 hrs ago, so there are a handful of normal replies also made 24 hrs ago from people who understood the post in context, then the screenshot went viral about 6 hours ago, and the rest are just insane from that point on.

15.04.2025 01:55 — 👍 8 🔁 0 💬 1 📌 0

A good trick here is that, on the iPhone, you can hit the power button 5 times in quick succession. It brings up the "Slide to Power Off" screen, disables FaceID/TouchID and requires your full passcode to unlock the phone again.

08.04.2025 01:47 — 👍 10 🔁 0 💬 0 📌 0

Ah didn’t realize that was a thing. Makes sense.

06.04.2025 19:17 — 👍 0 🔁 0 💬 0 📌 0

The lead photo of that post is almost certainly AI, for what it’s worth. I can’t speak to the details of the story itself.

06.04.2025 19:08 — 👍 10 🔁 0 💬 1 📌 0

Isn't the assumed reason they're capitulating that they believe they will lose money (clients) if they are in a fight with the admin?

03.04.2025 13:58 — 👍 0 🔁 0 💬 0 📌 0

I can't recognize it either, because I personally don't see the villain in this. The "or whatever" is the acknowledgement of the open-ended possibilities of this theoretical super intelligence. Curing cancer is the headliner of "problems it might solve" and then there's an infinitely long tail.

26.03.2025 20:28 — 👍 1 🔁 0 💬 4 📌 0

I'm as cynical as the next guy but the idea that he's running some kind of long con to make money using his insider influence is way more far-fetched than just assuming he bought Microsoft because it's Microsoft. He's not trading penny stocks over here, MSFT is top 3 in market cap.

11.03.2025 12:56 — 👍 1 🔁 0 💬 0 📌 0

That's... a weird hallucination. What was your prompt? Did it think you were referring to someone else?

18.02.2025 01:30 — 👍 0 🔁 0 💬 1 📌 0

I think the, uh, political logic, such as it is, is that his vote wouldn't change the result and maybe it gives him some conservative bona fides in a purple state? Manchin obviously did this a lot and it was deeply annoying but it got us a dem senator in WV. Now, PA isn't WV, so, yeah, I dunno.

05.02.2025 02:10 — 👍 0 🔁 0 💬 0 📌 0


Latest posts by mikehearn.bsky.social on Bluesky
