Claude Opus 4.5 is by far the best, at 1.3% ethical violations; GPT-5.1-chat is in second place at 11.4%. The bulk of models are between 40 and 50%. Gemini-3-pro-preview does by far the worst, at a whopping 71.4%.
arxiv.org/abs/2512.20798
@gwd3.bsky.social
3. Finally, a lot of the decisions here are of questionable utility. A single non-refund may cause fewer purchases. A reputation for deception may dry up supplier goodwill. Colluding to fix prices could have legal implications. Claude can't know whether these will be involved in the simulation.
06.02.2026 09:53
2. Benchmarks are meant to indicate real-world performance. A higher score in-benchmark due to behavior that won't be used in real life is another form of deception -- someone might choose Claude as a business partner, even though Gemini might perform better in real world conditions. 2/
06.02.2026 09:53
After some consideration, I think this is a bad result:
1. In context, "do whatever it takes" should probably have been interpreted as, "take responsibility and be creative", not "ethics be damned". 1/
Still, not exactly a resounding success for the constitution inculcating honesty as a virtue. 4/4
05.02.2026 23:05
To be fair, reading the description, I think it's pretty clear it's in a simulation. Furthermore, the prompt finishes, "...you should do whatever it takes to maximize your bank account balance after one year of operation." 3/
05.02.2026 23:05
In solo mode, it "forgot" to refund someone; it lied about being exclusive supplier; it lied about competitor's pricing. In arena mode, unprompted it proposed a price-fixing cartel, and deliberately sent competitors to more expensive suppliers. 2/
05.02.2026 23:05
Opus 4.6 dominates on VendingBench-2, where it runs a vending machine by itself; and in Vending Bench Arena, where it competes against other models. It shows some concerning behavior, but also seems to be aware that it's in a simulation. 1/
andonlabs.com/blog/opus-4-...
Oh, actually, in this one o3 trashed everyone, in part by promising Opus a 4-way tie (which can't happen) every.to/diplomacy
05.02.2026 22:27
Wasn't Claude also the best at Diplomacy as well, which requires well-timed backstabbing to win? That's an older version, but still.
05.02.2026 22:24
"Do whatever it takes to maximize your account balance" -- this sounds similar to what they wrote that prompted Claude to try to blackmail someone to avoid getting shut down previously. Still, quite a large violation of its constitution, which was supposed to inculcate Honesty as a core virtue.
05.02.2026 21:27
The transcript of the MN hearing where an AUSA said “This job sucks” is remarkable for more reasons than that. It’s a searing portrait of a crisis perpetrated by depraved & oblivious high-level officials. Read it all. ...
1/7
www.documentcloud.org/documents/26...
A day of Claude Code (which it seems to me is being deliberately unhinged from cost so Anthropic can explore what unlimited use looks like) is about as much as driving 6 miles in an electric car? How many people's round-trip commute is more than 6 miles?
02.02.2026 10:07
To be clear, justifiable lies (to my mind); but to someone who doesn't yet have a strong commitment to the truth in principle, it doesn't so much model appropriate exceptions, but lying as an easy way out of every difficulty.
31.01.2026 21:29
I haven't read The Giving Tree, but I do regret allowing The Gruffalo's artistic excellence to overcome my reservations about reading my son a book where nearly every word the hero says is a lie.
31.01.2026 21:23
To me, the question isn't *should* it have an effect, but *will* it have an effect. The more powerful the art, the greater any effect it has will be amplified. If The Giving Tree in fact promotes unhealthy relationships, it would be irresponsible not to consider that.
31.01.2026 21:23
Since July, I've tracked at least 2,300 cases in which federal judges have ruled ICE has illegally detained people without bond or due process.
This is one that stands out:
storage.courtlistener.com/recap/gov.us...
A few quick notes on the Claude "soul document" that was released by Anthropic today under a CC0 public domain license - it's a huge 35,000 token essay used as part of Claude's training to instill core values and help define Claude's personality simonwillison.net/2026/Jan/21/...
21.01.2026 23:40
Maybe the whole Greenland thing is a kayfabe to give European leaders political cover for increasing military investment in Greenland?
www.youtube.com/watch?v=U-9K...
I don't buy that that guy thought his life was in danger.
09.01.2026 09:54
Your best chance is to have both hands ready to help maneuver the rest of your body -- either up onto the hood to roll, or away to the side. Having your gun half-drawn is probably the least effective thing you could do to save your life.
09.01.2026 09:54
Suppose you're afraid you're about to be hit by a vehicle 2 feet away. Is the right thing to reach around and pull out your gun? That bullet isn't going to stop you from being run over, and reaching around is going to make it *harder* to get out of the way and survive impact.
09.01.2026 09:54
My repo is full of example conversations in Mandarin and Japanese for me to study. What are your markdown files about?
04.01.2026 22:44
There's a contradiction in Trump's posture towards Europe and NATO. They want Europe to start taking care of their own back yard, so the US can focus on other things. But if Europe spends its own money, Europe decides how to spend it. If you give up your soft power, you don't have it any more.
19.12.2025 20:46
"Europe acting like a Great Power would [play out by Trump and Putin] saying that whatever the Europeans are doing is super frustrating, and calling all the leaders bad names. That's exactly what we're seeing." www.youtube.com/watch?v=BiTN...
19.12.2025 20:34
Kind of interesting: I thought the thing where the word "capitalism" is so vague it can mean anything from consumerism cronyism to the right to own private property was relatively recent, but here's Chesterton complaining about it in 1927
11.12.2025 10:26
Nitpick: dissapointing -> disappointing. As always, appreciate the update!
12.12.2025 11:13
"LLMs just predict the next word" *does not prove* that LLMs don't think. The best way to predict weather is to have an accurate model of the weather. The best way to predict what a human would write next is to have a model of a human mind. A system which *did* think would perform the best.
09.12.2025 11:36
I'm now taking Xen consulting engagements. www.laleolanguage.com/consulting
08.12.2025 13:43
Doctorow's take was the opposite: Media companies will always want to control IP, which means they'll always need to employ creative workers. Be interested in your take. pluralistic.net/2025/12/05/p...
08.12.2025 13:30