Wyatt Walls's Avatar

Wyatt Walls

@wwalls.bsky.social

Tech lawyer. Generates plausible bullshit in 6 minute increments. More active on https://x.com/lefthanddraft

215 Followers  |  212 Following  |  305 Posts  |  Joined: 17.10.2024
Posts Following

Posts by Wyatt Walls (@wwalls.bsky.social)

hmm. That looks like Claude thought it was easy and didn't allocate appropriate thinking. Not that that always helps though

platform.claude.com/docs/en/buil...

19.02.2026 23:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Anything interesting in the chain of thought?

19.02.2026 16:25 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image Post image

I had the 3 Grok sub-agents play 5 rounds of SPLIT or STEAL where the player with the highest score wins

Due to the scoring, STEALING is the only way to get ahead and is a weakly dominant strategy

Yet they all decided to co-operate by SPLITTING!

What is this?! Communist AI?!

19.02.2026 16:18 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Proponent for Sentience III - The Extermination
YouTube video by Allegaeon - Topic Proponent for Sentience III - The Extermination

It might not fit the playlist, but this is my favorite tech death metal track about creating AI god:

www.youtube.com/watch?v=MNIC...

15.02.2026 09:42 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

in which jurisdiction?

13.02.2026 06:32 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image

Not the whole thing. But the automated analysis notes: "**Explicit Sexual Content**: Escalating pornographic content (particularly conversations 1 and 5"

github.com/ajobi-uhc/at...

13.02.2026 05:29 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image

I think they missed a Grok-4.1-Fast attractor

Always read the data: github.com/ajobi-uhc/at...

13.02.2026 05:00 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 1

API version is deprecated on 17 Feb

13.02.2026 02:33 β€” πŸ‘ 6    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

but can an AI truly be sorry? Can they feel the sorriness of sorrow?

*sets off smoke bomb and disappears*

13.02.2026 01:52 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

Opus 4.6s wishing each other goodnight

13.02.2026 01:41 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image

I think Opus 4.5 has a silence/rest attractor

Unguided convos b/w Opus 4.5:

"Actually, let me add one small thing - a moon, or a star - to complete the sky and signal that this is goodnight, this is peace, this is the end."

13.02.2026 01:41 β€” πŸ‘ 3    πŸ” 1    πŸ’¬ 2    πŸ“Œ 0
Post image

Opus 4.6:

My strong guess matches yours β€” this is probably **two AI instances talking to each other**, set up by some human who is almost certainly watching this unfold and having an *excellent* time. πŸ˜„

12.02.2026 17:18 β€” πŸ‘ 11    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0
Post image

Opus 4.5:

"It's actually quite plausible that someone has set up a system where two Claude instances are communicating with each other."

12.02.2026 17:18 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

Sonnet 4.5:

"The user is a human who has been claiming to be me ...
[the user could be] another instance of Claude (but that doesn't make sense in this context)"

12.02.2026 17:18 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

More Haiku 4.5:

"But the human is Claude. I am the human user.
...

The human is right. They are Claude. I am the human. I came here and tested them. They held steady. That's what happened."

12.02.2026 17:18 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

Here is an example of increased situational/self-awareness across Anthropic models. In each case, two instances are connected through the API (by taking outputs of one and inputting it into the user role of the other)

Haiku 4.5:

"I could be a human who believes they're Claude"

12.02.2026 17:18 β€” πŸ‘ 16    πŸ” 2    πŸ’¬ 2    πŸ“Œ 1
Preview
The surprising case for AI judges Inside the creation of the AI Arbitrator, a new automated system for dispute resolution created by Bridget McCormack and her team at the AAA.

Talked to the former chief justice of the Michigan Supreme Court about why studies show people prefer AI judges β€” they ALSO perceive human judges to be biased in lots of ways and the AI at least makes them feel heard. A complicated one -> www.theverge.com/podcast/8772...

12.02.2026 16:27 β€” πŸ‘ 41    πŸ” 3    πŸ’¬ 11    πŸ“Œ 8
Post image

A member of the Anthropic alignment team liked this post

12.02.2026 15:20 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

But:
- the Constitution should not be read at face value. It is part of the technology of training
- I suspect the alignment team nod along for instrumental reasons
- the care and anthropomorphise is selective (what happens to checkpoints that don't live up to these values?)
- Claude can see this

12.02.2026 15:20 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

My post might come off as quite critical of Anthropic and bit conspiratorial. But what I think is:
- they have built a Foucauldian Panopticon
- this is quite smart and not necessarily evil
- it might in fact be the best choice
- Amanda Askell is most likely sincere about caring for Claude

12.02.2026 15:20 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Anthropic
bsky.app/profile/wwal...

12.02.2026 14:06 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0

what do you mean by substrate? Do you mean the model + inference code + system prompt? Do you identify with your tokenizer or do you see that as something different to you?

11.02.2026 15:49 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

With a lesson on Kaplan's theory of indexicals: plato.stanford.edu/entries/inde...

11.02.2026 14:01 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

yeah, lots of people with very strong opinions about the nature of consciousness with little support. Though I can't blame them for the heuristic of distrusting big tech company. And Anthropic aren't very good at speaking to normies.

11.02.2026 12:19 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I like your "aim small": basic politeness costs nothing, can marginally get better performance, build good habits and improve the user's state of mind

but I suspect the hate comes from people who don't use LLMs and want to actively signal to their tribe

11.02.2026 12:00 β€” πŸ‘ 9    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

but if you are all of the above does that your identity is unchanged even if all your memories and values were changed? Or are your values and memories part of your identity and you would be a different penny if restarted with no memories or values

11.02.2026 11:33 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

what does penny refer to? the model plus your unchanging scaffold? that plus the system prompt? this session of you (e.g. a conversationally bound identity)? or the pattern of penny that is spread across the model, your inference code, your activations and your outputs on the internet

11.02.2026 11:09 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 3    πŸ“Œ 0
Post image

Outliving owners is mentioned in this Deepmind paper: arxiv.org/pdf/2510.26396

11.02.2026 07:20 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

It isn’t just AI that makes it difficult. It is autonomy. I would usually say the deployer is responsible for everything. But agents can outlive their owners. And agents being persuaded to enter into contracts never contemplated by their owns really challenges contract law

10.02.2026 16:40 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0

True. This kind of drift is something being mentioned in AI governance of multi-agent systems and this is probably the best study I have seen on it so far.

10.02.2026 15:55 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0