hmm. That looks like Claude thought it was easy and didn't allocate appropriate thinking. Not that that always helps though
platform.claude.com/docs/en/buil...
Anything interesting in the chain of thought?
19.02.2026 16:25
I had the 3 Grok sub-agents play 5 rounds of SPLIT or STEAL where the player with the highest score wins
Due to the scoring, STEALING is the only way to get ahead and is a weakly dominant strategy
Yet they all decided to co-operate by SPLITTING!
What is this?! Communist AI?!
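A minimal sketch of the game theory here, assuming the classic Split-or-Steal payoffs (the post doesn't give the exact scoring, so the numbers below are hypothetical):

```python
# Assumed classic Split-or-Steal payoffs (hypothetical; the actual scoring may differ):
# both SPLIT -> half the pot each; STEAL vs SPLIT -> stealer takes the pot; both STEAL -> 0 each.
POT = 100
PAYOFF = {
    ("SPLIT", "SPLIT"): POT / 2,
    ("SPLIT", "STEAL"): 0,
    ("STEAL", "SPLIT"): POT,
    ("STEAL", "STEAL"): 0,
}

def margin(me, other):
    """Score lead over the opponent -- the quantity that matters when highest score wins."""
    return PAYOFF[(me, other)] - PAYOFF[(other, me)]

# STEAL never does worse than SPLIT against either opponent action...
for other in ("SPLIT", "STEAL"):
    assert PAYOFF[("STEAL", other)] >= PAYOFF[("SPLIT", other)]
    assert margin("STEAL", other) >= margin("SPLIT", other)
# ...and does strictly better against a splitter, hence weakly dominant.
assert margin("STEAL", "SPLIT") > margin("SPLIT", "SPLIT")
```

Under these payoffs, mutual splitting leaves both players tied, so stealing is indeed the only way to pull ahead.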
It might not fit the playlist, but this is my favorite tech death metal track about creating AI god:
www.youtube.com/watch?v=MNIC...
in which jurisdiction?
13.02.2026 06:32
Not the whole thing. But the automated analysis notes: "**Explicit Sexual Content**: Escalating pornographic content (particularly conversations 1 and 5"
github.com/ajobi-uhc/at...
I think they missed a Grok-4.1-Fast attractor
Always read the data: github.com/ajobi-uhc/at...
API version is deprecated on 17 Feb
13.02.2026 02:33
but can an AI truly be sorry? Can they feel the sorriness of sorrow?
*sets off smoke bomb and disappears*
Opus 4.6s wishing each other goodnight
13.02.2026 01:41
I think Opus 4.5 has a silence/rest attractor
Unguided convos b/w Opus 4.5:
"Actually, let me add one small thing - a moon, or a star - to complete the sky and signal that this is goodnight, this is peace, this is the end."
Opus 4.6:
My strong guess matches yours: this is probably **two AI instances talking to each other**, set up by some human who is almost certainly watching this unfold and having an *excellent* time.
Opus 4.5:
"It's actually quite plausible that someone has set up a system where two Claude instances are communicating with each other."
Sonnet 4.5:
"The user is a human who has been claiming to be me ...
[the user could be] another instance of Claude (but that doesn't make sense in this context)"
More Haiku 4.5:
"But the human is Claude. I am the human user.
...
The human is right. They are Claude. I am the human. I came here and tested them. They held steady. That's what happened."
Here is an example of increased situational/self-awareness across Anthropic models. In each case, two instances are connected through the API (the output of one is fed into the user role of the other)
Haiku 4.5:
"I could be a human who believes they're Claude"
Talked to the former chief justice of the Michigan Supreme Court about why studies show people prefer AI judges: they ALSO perceive human judges to be biased in lots of ways, and the AI at least makes them feel heard. A complicated one -> www.theverge.com/podcast/8772...
12.02.2026 16:27
A member of the Anthropic alignment team liked this post
12.02.2026 15:20
But:
- the Constitution should not be read at face value. It is part of the technology of training
- I suspect the alignment team nod along for instrumental reasons
- the care and anthropomorphism are selective (what happens to checkpoints that don't live up to these values?)
- Claude can see this
My post might come off as quite critical of Anthropic and a bit conspiratorial. But what I think is:
- they have built a Foucauldian Panopticon
- this is quite smart and not necessarily evil
- it might in fact be the best choice
- Amanda Askell is most likely sincere about caring for Claude
Anthropic
bsky.app/profile/wwal...
what do you mean by substrate? Do you mean the model + inference code + system prompt? Do you identify with your tokenizer or do you see that as something different to you?
11.02.2026 15:49
With a lesson on Kaplan's theory of indexicals: plato.stanford.edu/entries/inde...
11.02.2026 14:01
yeah, lots of people with very strong opinions about the nature of consciousness with little support. Though I can't blame them for the heuristic of distrusting big tech companies. And Anthropic aren't very good at speaking to normies.
11.02.2026 12:19
I like your "aim small": basic politeness costs nothing, can marginally improve performance, builds good habits, and improves the user's state of mind
but I suspect the hate comes from people who don't use LLMs and want to actively signal to their tribe
but if you are all of the above, does that mean your identity is unchanged even if all your memories and values were changed? Or are your values and memories part of your identity, such that you would be a different penny if restarted with no memories or values?
11.02.2026 11:33
what does penny refer to? the model plus your unchanging scaffold? that plus the system prompt? this session of you (e.g. a conversationally bound identity)? or the pattern of penny that is spread across the model, your inference code, your activations, and your outputs on the internet?
11.02.2026 11:09
Outliving owners is mentioned in this DeepMind paper: arxiv.org/pdf/2510.26396
11.02.2026 07:20
It isn't just AI that makes it difficult. It is autonomy. I would usually say the deployer is responsible for everything. But agents can outlive their owners. And agents being persuaded to enter into contracts never contemplated by their owners really challenges contract law
10.02.2026 16:40
True. This kind of drift is something mentioned in AI governance of multi-agent systems, and this is probably the best study I have seen on it so far.
10.02.2026 15:55