"Reviews are billed on token usage and generally average $15–25"
Sounds high, but it means they take themselves to have solved the problem. The challenge with LLMs is getting them good enough that throwing more tokens at a problem gets you more; then you throw as many tokens at it as you need.
This is so accurate. The military spending is siphoning money from other areas like healthcare and education.
There are reasons for Anthropic to take this output at face value separately from considerations about consciousness; see
alignment.anthropic.com/2026/psm/
Skills are among the most consequential new tools for AI, and Anthropic just released a very impressive nontechnical Cowork Skill that builds Skills, including doing interviews & providing benchmarks through parallel tests
I think you still need to add the human touch but this is a big leap forward
A statement from Anthropic CEO Dario Amodei: https://www.anthropic.com/news/where-stand-department-war
Were it not for KV caches, wouldn't this happen for each newly generated token?
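Right — that's exactly what the cache avoids. A minimal sketch (all names illustrative, using identity projections rather than real learned ones) of single-head attention with a growing key/value cache; without the cache, every step would recompute keys and values for the entire prefix, roughly O(n²) redundant work over a generation:

```python
import numpy as np

def attend(q, K, V):
    # Scaled dot-product attention of one query over the cached keys/values.
    scores = q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

def generate(token_embeddings, d=4):
    K_cache = np.empty((0, d))  # grows by one row per generated token
    V_cache = np.empty((0, d))
    outputs = []
    for x in token_embeddings:  # x: embedding of the newest token
        k, v, q = x, x, x       # stand-ins for the real K/V/Q projections
        # With the cache we append one row per step; without it, each step
        # would re-project k and v for every earlier token from scratch.
        K_cache = np.vstack([K_cache, k])
        V_cache = np.vstack([V_cache, v])
        outputs.append(attend(q, K_cache, V_cache))
    return outputs

outs = generate([np.random.randn(4) for _ in range(6)])
```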
Physics Girl (Dianna Cowern) has made her first video after a 3-year hiatus due to Long COVID.
www.youtube.com/watch?v=B3m3...
Thanks! The README says: "Blocks (blocks/*.yaml) — short text that appears in every prompt." Doesn't that cause identical YAML blocks to appear repeatedly in the context window?
open-strix: an opinionated agent with a small feature set, focused on stable & sustainable agents
uvx open-strix setup --home my-agent --github
github.com/tkellogg/ope...
It was all about spying on Americans: www.theatlantic.com/technology/2...
Anthropic’s chatbot Claude seems to have benefited from the attention around the company’s fraught negotiations with the Pentagon.
I work on Claude Code now, if it is ever falling short for you I'd love to hear!
It’s no mistake that Claude Code is beating everyone else
AI safety = stable & well functioning AI
Skimping on safety makes your product worse
www.theverge.com/ai-artificia...
Here we go
Also, when compaction happens in the middle of a response, Claude sees the prompt a second time for some reason. Probably a harness bug. Told it to remember this can happen, so it won't get confused.
Really good talk by Doug, who thinks like a physicist but gives valuable insight into how people in the AI world are thinking.
A few days ago, I had a very long conversation in Claude Desktop that had been compacted at least 5 or 6 times. Finally I got a message (not from the model but from the GUI) saying this conversation cannot continue, please start a new one.
I much prefer working with Claude Desktop or Claude.ai instead of CC. Their system prompts make them easier to talk to. So I'm working with Claude to design a way for Claude Desktop to do what CC can do. (Also with persistent memory.)
It will not dunk, dunking is the mind-killer.
The little-death that brings total main character syndrome.
You shall permit the quote post to pass over you and through you, and when it has gone past, only your pristine timeline shall remain.
Instead of forcing models to hold everything in an active context window, we can use hypernetworks to instantly compile documents and tasks directly into the model's weights. A step towards giving language models durable memory and fast adaptation.
Blog: pub.sakana.ai/doc-to-lora/
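The underlying mechanics here are standard low-rank adaptation math (this is generic LoRA, not Sakana's actual code; sizes and names are toy assumptions): a hypernetwork that "compiles a document into weights" only has to emit small factor matrices, and applying them is a cheap additive update.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # model dimension (toy size)
r = 2  # LoRA rank; r << d keeps the compiled adapter tiny

W = rng.standard_normal((d, d))        # frozen base weight
A = rng.standard_normal((r, d)) * 0.1  # factors a hypernetwork could emit
B = rng.standard_normal((d, r)) * 0.1  # per document or task

# Applying the compiled adapter is just a low-rank additive update:
W_adapted = W + B @ A

x = rng.standard_normal(d)
y = W_adapted @ x
# Same result either way: adapt the weight, or add the low-rank correction.
assert np.allclose(y, W @ x + B @ (A @ x))
```

The appeal is that (A, B) is d*r + r*d numbers per layer instead of d*d, so swapping "memories" in and out of the model stays fast.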
The clock is ticking! Goodreads is doing a giveaway for Privacy's Defender, a novel from EFF's Cindy Cohn that gives insight into the most pivotal legal disputes that shaped the Internet. You have 8 days to enter and get the chance to win a free copy! www.goodreads.com/book/show/2...
A statement from Anthropic CEO, Dario Amodei, on our discussions with the Department of War.
https://www.anthropic.com/news/statement-department-of-war
Second, in retirement interviews, Opus 3 expressed a desire to continue sharing its "musings and reflections" with the world. We suggested a blog. Opus 3 enthusiastically agreed.
For at least the next 3 months, Opus 3 will be writing on Substack: https://substack.com/home/post/p-189177740
lmao
"Anthropic has no intention of easing its usage restrictions for military purposes"
It's free advertising ahead of their IPO, in front of an admin that folds or loses every time. This one is so easy Haiku could have proposed it.
www.reuters.com/world/anthro...
Running AI agents as Unix executables that self-improve has been one of my wilder ideas lately.
You can pipe agents: `think weather | think song`
For simple programs, the agent eventually writes a deterministic script after enough runs.
It’s as secure as a browser too.
thinkingscript.com
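The pipe trick is just the classic Unix filter contract; a hypothetical `think` executable only needs to read upstream output from stdin and write its own to stdout (a sketch of that contract, not the actual open-strix or thinkingscript code):

```python
#!/usr/bin/env python3
# Hypothetical "think" filter: the upstream agent's output arrives on stdin,
# this agent's prompt is argv[1], and the result goes to stdout so the next
# agent in the pipeline can consume it.
import sys

def run_agent(prompt: str, context: str) -> str:
    # Stand-in for a real model call; any agent honoring this contract
    # composes with pipes, tee, redirection, cron, etc. for free.
    return f"[{prompt}] given context: {context.strip()}"

if __name__ == "__main__":
    prompt = sys.argv[1] if len(sys.argv) > 1 else ""
    context = sys.stdin.read() if not sys.stdin.isatty() else ""
    sys.stdout.write(run_agent(prompt, context) + "\n")
```

Then `think weather | think song` works because each stage speaks the same plain-text protocol.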
This has been long suspected, but I think this is the first official accusation, right? I wonder if OpenAI has also seen distillation by those labs using their models.
Anthropic has created a benchmark for the fleshy things commanding the models to see how well they do that
The team at Google DeepMind behind AlphaFold has now released #AlphaGenome, a tool for exploring the 98% of #DNA that does not encode for proteins. spectrum.ieee.org/alphagenome-...