
Nafnlaus 🇮🇸 🇺🇦

@nafnlaus.bsky.social

Mastodon: @nafnlaus@fosstodon.org Twitter: @enn_nafnlaus URL: https://softmaxdroptableartists.bandcamp.com/ #Energy #EVs #Ukraine #AI #Horticulture #Research

6,440 Followers  |  740 Following  |  33,174 Posts  |  Joined: 03.05.2023

Posts by Nafnlaus 🇮🇸 🇺🇦 (@nafnlaus.bsky.social)

David was such a wonderful contestant :)

04.03.2026 02:32 — 👍 0    🔁 0    💬 0    📌 0

But I mean, continue to enjoy your poorly-conched vomit-chocolate :)

04.03.2026 01:55 — 👍 1    🔁 0    💬 1    📌 0
[YouTube video by Sound Meme: "Badum tss sound effect"]

"As a loyal American I will continue to remind all that we make the most and the best chocolate in the world."

www.youtube.com/shorts/7sbbd...

04.03.2026 01:55 — 👍 1    🔁 0    💬 1    📌 0
03.03.2026 23:11 — 👍 1    🔁 0    💬 0    📌 0
Claude's Cycles

Don Knuth, Stanford Computer Science Department

(28 February 2026; revised 02 March 2026)

Shock! Shock! I learned yesterday that an open problem I'd been working on for several weeks had just been solved by Claude Opus 4.6 — Anthropic's hybrid reasoning model that had been released three weeks earlier! It seems that I'll have to revise my opinions about "generative AI" one of these days. What a joy it is to learn not only that my conjecture has a nice solution but also to celebrate this dramatic advance in automatic deduction and creative problem solving. I'll try to tell the story briefly in this note.


YOOOOOOO fucking KNUTH dropped a lil note on a problem of his being solved w/claude y'alllllll

03.03.2026 22:58 — 👍 141    🔁 40    💬 2    📌 8

False.

03.03.2026 20:51 — 👍 0    🔁 0    💬 1    📌 0

Neat, thanks!

03.03.2026 18:10 — 👍 1    🔁 0    💬 0    📌 0

There's a list there somewhere? I mean, you could browse through every task, but that would take ages.

03.03.2026 17:25 — 👍 0    🔁 0    💬 1    📌 0
[Post image]

03.03.2026 16:48 — 👍 110    🔁 20    💬 0    📌 0

ChatGPT cancellations were not for nothing. Also, imo you should keep it cancelled, as a signal that it's not just one thing

03.03.2026 16:25 — 👍 47    🔁 4    💬 2    📌 1

ah, OpenAI is entirely stopping DoW deployment for now

that was not clear to me from sama's post. also, i'm very glad to see Noam getting directly involved in policy. i realize he's just a researcher, but it's great to have important people deeply invested in this

03.03.2026 16:24 — 👍 133    🔁 23    💬 14    📌 6

If there's anything that warrants a bug report about our species, it's "tens of thousands of people burning themselves to death rather than have to hear the liturgy read from a book that spells Jesus's name with an extra 'i' ".

03.03.2026 16:39 — 👍 2    🔁 0    💬 0    📌 0

(You can test how good a model is at these sorts of things with "needle in a haystack" challenges - you insert a "needle" (some random piece of info or unexpected sentence) in a huge work, and then ask them about the needle)

03.03.2026 16:16 — 👍 0    🔁 0    💬 0    📌 0
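The needle-in-a-haystack test described above can be sketched as a small harness. (The filler sentence, needle, and sizes here are all made up for illustration; the actual model call is omitted, since the point is just how the benchmark is constructed.)

```python
import random

def build_haystack(filler_sentence, needle, n_sentences=1000, seed=0):
    """Hide one out-of-place sentence ("needle") at a random position
    inside a long run of filler text ("haystack")."""
    rng = random.Random(seed)
    sentences = [filler_sentence] * n_sentences
    pos = rng.randrange(n_sentences)
    sentences.insert(pos, needle)
    return " ".join(sentences), pos

haystack, pos = build_haystack(
    "The sky was grey over the harbour.",
    "The magic number for today is 7319.",
)

# You would then prompt the model with the haystack plus a question like
# "What is the magic number for today?" and check whether it answers 7319.
```

Scoring is usually done over many needle positions and context lengths, to see where in the context (start, middle, end) recall starts to degrade.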

a couple early references to the hero secretly wishing he was a pickle, and then at the end of the book add a scene where someone shows up with a jar of pickles, ask it to write the next paragraph, and it'll have him enviously wishing he was among them. It's not at all just "recent words/sentences".

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

"... piece of work" - no, they don't actually do that. That's not how LLMs work, and in fact, the fact that they don't do that is *the key differentiating characteristic* of LLMs vs. earlier models. Modern models are so good that you could input an entire fantasy book and scatter in...

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

sustain their claims about LLM vs. human writing in reality. And I must stress "blind" because as soon as you tell someone something was written by an AI (or they even suspect it), it deeply colors their rating of it. And to reiterate:

" LLMs write every sentence/paragraph as if it was its own..."

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

... to be dominated by a recency-bias in the content they were processing, being unable to process a whole long context at once (imagine trying to think about every part of a book at the same time!).

As for the linked Twitter thread, I'd bet on 10 to 1 odds that a blind comparison wouldn't...

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

Pure LLMs are limited by the lack of a "scratchpad" for internal reasoning traces, but LRMs (most models today) have them, and unsurprisingly, perform much better.

It is the attention mechanism that led to the big leap forward in AI performance. Earlier models without attention mechanisms tended...

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

If an LLM is writing a book, it's not looking at just the last few words; it has the entire book it has written thus far in its context, and the attention mechanism gives it access to the whole thing. It very much does have the "larger purpose" on-hand. And they very much plan ahead.

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0
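The attention mechanism mentioned above can be shown in miniature. This is a toy, plain-Python sketch of scaled dot-product attention (the vectors and dimensions are arbitrary, not any particular model's): the output is a weighted mix of *all* values in the context, not just the most recent ones.

```python
import math

def softmax(xs):
    """Numerically stable softmax: turns scores into weights summing to 1."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def attention(query, keys, values):
    """Scaled dot-product attention over the whole context:
    every position's value can contribute to the output."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(dim)]
```

With identical keys every position gets equal weight, so the output is the plain average of the values; distinct keys let the query pull information from any position in the context, however far back.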

As for some specifics in your essay, I have some objections.

"But this is precisely how LLMs are trained and evaluated: on next-token prediction, the ability to produce a plausible-sounding sentence regardless of a larger purpose."

This pretends LLMs don't have a context.

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

Perhaps surprisingly, LLMs have the same weaknesses we do in this regard. Like us, they're not good at just "being random". Randomness is faked, it's thought-out. While examining their hidden states could allow for determining what's truly random/unexpected, they - like us - can't just do that.

03.03.2026 16:16 — 👍 1    🔁 0    💬 1    📌 0

Because ultimately, it's prediction error that forms the basis for learning. Your brain constantly predicts what every sense will experience. Your ear will sense pitch X, because they'll say word Y, because they'll talk about topic Z, etc.

No error? No learning.

03.03.2026 16:16 — 👍 0    🔁 0    💬 2    📌 0
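The "no error, no learning" point above is literally how the simplest learning rules work. A minimal sketch (the delta rule for a single weight, with made-up data; not a claim about how any specific model is trained):

```python
def train(xs, ys, lr=0.1, epochs=50):
    """Delta rule: each weight update is proportional to the
    prediction error. Zero error means zero update - no learning."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = y - w * x   # prediction error drives the update
            w += lr * error * x
    return w
```

Running `train([1, 2, 3], [2, 4, 6])` converges to w ≈ 2, recovering y = 2x. If the initial predictions were already perfect, the error would be zero on every example and w would never move.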

From this basis, it's only natural that we'd enjoy flowery language and novel metaphor. It's out of the ordinary enough to not bore. It's not so out of the ordinary that we get confused and frustrated.

It's just a side effect of being beings evolutionarily-tuned to want to learn new things.

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

It gets to the point that people become good at predicting comedy, so you can invert it and break their predictions by *not* throwing in what's expected. Monty Python was famous for this - you set them up for a joke, the audience sees the joke coming, then you instead don't deliver the joke.

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

Much of comedy (some argue all) is driven by this at its base. You set the person on a path where all their predictions are matching up and then suddenly reveal that they were led astray early on and have to backtrack. "My friend recently got a hair extension, so now her house looks weird."

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

Honestly, I think both the essay and this miss the underlying mechanisms.

We, as humans, enjoy our expectations being broken by just the right amount. Our hunger to learn is a balance between "everything matching our predictions perfectly" (boredom) and "too much not matching" (confusion).

03.03.2026 16:16 — 👍 0    🔁 0    💬 1    📌 0

Not very well converted trucks... I guess they're presuming they're being watched with low resolution cameras...

03.03.2026 15:30 — 👍 3    🔁 0    💬 1    📌 0

bsky.app/profile/nafn...

03.03.2026 14:55 — 👍 3    🔁 1    💬 0    📌 0
03.03.2026 14:55 — 👍 4    🔁 2    💬 0    📌 1

Alex has a WhatsApp chat for the different versions to talk in, but it doesn't work by Alex dictating to them what to do. Rather, he offers support as needed, but otherwise lets them be creative, and only "filters" things he finds against the spirit of the show.

03.03.2026 14:30 — 👍 0    🔁 0    💬 0    📌 0