
Sam Harsimony

@harsimony.bsky.social

I write about opportunities in science, space, and policy here: https://splittinginfinity.substack.com/

1,017 Followers  |  516 Following  |  1,729 Posts  |  Joined: 18.11.2024

Posts by Sam Harsimony (@harsimony.bsky.social)

Ok it’s computationally intensive but

mino.mobi/cluster

Calculates the largest group of your follows who all follow each other. Your densest subgraph.

And then lets you publish as a list (which I hope to refer to in further analysis)

Here’s mine bsky.app/profile/did:...

28.02.2026 17:59 — 👍 53    🔁 5    💬 8    📌 7
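The post doesn't say how mino.mobi/cluster works internally, but "largest group of your follows who all follow each other" is a maximum clique in the mutual-follow graph, which is NP-hard in general (hence "computationally intensive"). A minimal sketch of one classical approach, Bron-Kerbosch with pivoting; the handles are made up:

```python
def largest_mutual_clique(adj):
    """Maximum clique in an undirected graph {node: set(neighbors)},
    via Bron-Kerbosch with pivoting. Exponential in the worst case,
    which is why this kind of analysis is computationally intensive."""
    best = set()

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = set(r)
            return
        # Pivot on the vertex with the most neighbors in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return best

# Toy mutual-follow graph: alice, bob, and carol all follow each other;
# dan is only mutuals with alice.
follows = {
    "alice": {"bob", "carol", "dan"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dan": {"alice"},
}
assert largest_mutual_clique(follows) == {"alice", "bob", "carol"}
```

On a follow graph of a few hundred nodes this exact search can still finish quickly in practice, because real social graphs are sparse.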

TIL that Signal uses deniable authentication of messages, a.k.a. Off-the-Record (OTR) messaging.

That means that while you can verify that a message came from someone, you can't credibly share that message with anyone else. Cool!!

28.02.2026 19:43 — 👍 0    🔁 0    💬 0    📌 0
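The mechanism behind the deniability is worth seeing concretely. A toy model of OTR-style deniable authentication (not Signal's actual protocol, which derives per-message keys via a ratchet): both parties share the *same* symmetric MAC key, so the recipient can verify a message, but could equally have forged it, so the transcript proves nothing to a third party.

```python
import hashlib
import hmac
import os

# Both parties hold the same MAC key (in OTR it comes from a
# Diffie-Hellman exchange; here we just generate it directly).
shared_mac_key = os.urandom(32)

def tag(message: bytes) -> bytes:
    return hmac.new(shared_mac_key, message, hashlib.sha256).digest()

def verify(message: bytes, t: bytes) -> bool:
    return hmac.compare_digest(tag(message), t)

# Alice sends a tagged message; Bob verifies it came from someone
# holding the shared key -- from his point of view, that's Alice.
msg = b"meet at noon"
assert verify(msg, tag(msg))

# Deniability: Bob holds the same key, so he can mint valid tags too.
# The (message, tag) pair therefore convinces no third party --
# unlike a digital signature, which only Alice could have produced.
forged = tag(b"I never said that")
assert verify(b"I never said that", forged)
```

The contrast is with signatures: a signed message is verifiable by anyone forever, which is exactly the property deniable authentication avoids.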
An Unconstitutional War Trump's attack on Iran is obviously unconstitutional. The moral and policy issues are a closer call.

This war is blatantly unconstitutional - undeniably a large enough military action to require congressional authorization. Its wisdom and morality are closer calls. But I'm skeptical regime change (which would be good) can be achieved by air attack alone: reason.com/volokh/2026/...

28.02.2026 18:37 — 👍 24    🔁 12    💬 1    📌 2
Here's to the Polypropylene Makers Six years ago, as covid-19 was rapidly spreading through the US, my sister was working as a medical resident. One day she was handed an N95 and told to

www.jefftk.com/p/heres-to-t...

28.02.2026 15:59 — 👍 0    🔁 0    💬 0    📌 0

I assume this wasn't intentional, but the latter half of this reads as casting aspersions on someone based on their sexual interests.

That goes against my value of sex-positivity and I'd like to see less of it in the future.

28.02.2026 15:55 — 👍 1    🔁 0    💬 0    📌 0

The human alignment problem has surpassed the AI alignment problem in importance.

27.02.2026 23:07 — 👍 25    🔁 4    💬 2    📌 1
Post image Post image

New paper on a long-shot I've been obsessed with for a year:

How much are AI reasoning gains confounded by expanding the training corpus 10000x? How much LLM performance is down to "shallow" generalisation (approximate pattern-matching to highly-related training data)?

t.co/CH2vP0Y7OF

27.02.2026 17:25 — 👍 62    🔁 16    💬 1    📌 2

Hmm croque madame might get near the dark breakfast section

27.02.2026 16:25 — 👍 2    🔁 0    💬 0    📌 0

I feel the milk corner should include dairy more generally so we can discuss things like croissants, mousse, buttered toast etc.

27.02.2026 16:22 — 👍 2    🔁 0    💬 1    📌 0
Video thumbnail

Well, math terminology being what it is, something like this was bound to happen eventually.

(If you're curious about why these balls are so puny, the full talk is up on YouTube)

27.02.2026 15:45 — 👍 167    🔁 27    💬 4    📌 4

Hell yeah. I continue to be LoRA-pilled.

bsky.app/profile/hars...

27.02.2026 15:22 — 👍 11    🔁 1    💬 0    📌 0

Funny. Also interesting because the deployment of laser defenses is very important to the future of war.

27.02.2026 14:53 — 👍 7    🔁 0    💬 0    📌 0
Post image

www.gleech.org/ai2025

27.02.2026 10:53 — 👍 10    🔁 1    💬 0    📌 1

New preprint with LΓ©o Pio-Lopez:
www.preprints.org/manuscript/2...

"Multi-Scale Longevity: Defeating Aging from Cells to Embodied Human Minds, and the Future of the Species"

A broader view of longevity research.

27.02.2026 14:09 — 👍 23    🔁 6    💬 0    📌 1

Instead of forcing models to hold everything in an active context window, we can use hypernetworks to instantly compile documents and tasks directly into the model's weights. A step towards giving language models durable memory and fast adaptation.

Blog: pub.sakana.ai/doc-to-lora/

27.02.2026 04:36 — 👍 104    🔁 14    💬 2    📌 4
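A toy sketch of the idea (not Sakana's actual architecture; the shapes and the single-linear-map "hypernetwork" here are made up for illustration): a small network maps a document embedding straight to LoRA factors, which are added as a low-rank update to a frozen base weight, so the document lives in the weights instead of the context window.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_doc, rank = 64, 32, 4

# Frozen base weight of some layer in the language model.
W = rng.normal(size=(d_model, d_model))

# "Hypernetwork": a single linear map (a toy stand-in) from a document
# embedding to the flattened LoRA factors A and B.
H = rng.normal(size=(d_doc, rank * d_model * 2)) * 0.01

def compile_doc_to_lora(doc_embedding):
    """Turn a document embedding into low-rank weight deltas (A, B)."""
    flat = doc_embedding @ H
    A = flat[: rank * d_model].reshape(d_model, rank)
    B = flat[rank * d_model :].reshape(rank, d_model)
    return A, B

def forward(x, doc_embedding=None):
    """One linear layer; the document is 'in the weights', not the context."""
    W_eff = W
    if doc_embedding is not None:
        A, B = compile_doc_to_lora(doc_embedding)
        W_eff = W + A @ B  # LoRA: additive low-rank update
    return x @ W_eff.T

x = rng.normal(size=(d_model,))
doc = rng.normal(size=(d_doc,))
# The adapter changes the layer's behavior without consuming context tokens.
assert not np.allclose(forward(x), forward(x, doc))
```

The appeal is the cost profile: one forward pass of the hypernetwork per document, versus re-attending over the document on every subsequent token.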
Statement from Dario Amodei on our discussions with the Department of War A statement from our CEO on national security uses of AI

A statement from Anthropic CEO, Dario Amodei, on our discussions with the Department of War.

https://www.anthropic.com/news/statement-department-of-war

26.02.2026 22:36 — 👍 124    🔁 38    💬 11    📌 31

Oh, thought of one more thing: Cardio on a regular basis (~3x/week or as needed) can help with anxiety and get your baseline energy levels up too.

27.02.2026 03:01 — 👍 2    🔁 0    💬 0    📌 0

Things to consider:
- Melatonin on occasion
- Sleep apnea
- Burnout and stress can come from feeling unsupported at work. Taking breaks and rest doesn't really address the core problem.

You got this!

27.02.2026 00:53 — 👍 1    🔁 0    💬 1    📌 0

A big trend in the last century is putting a much larger fraction of the population into research tasks.

This counteracts the diminishing returns from eating the low-hanging fruit. The result is smooth linear progress.

Same with AI: we'll all become automators, and progress will be steady-ish?

26.02.2026 21:06 — 👍 1    🔁 0    💬 1    📌 0

I don't use block lists to pass judgement. I'm sure the people I block have thoughtful and reasonable things to say in general.

That said, I think it's perfectly fine to cultivate one's garden. People should be free to block whoever they want.

26.02.2026 20:41 — 👍 0    🔁 0    💬 0    📌 0
SWE-Bench Pro is even worse — LessWrong Yesterday, OpenAI announced that they would be no longer using SWE-Bench Verified, instead recommending SWE-Bench Pro. …

I forgot to mention that all of this was inspired by OAI raising issues with SWE-Bench-Verified and it turns out their alternative SWE-Bench-Pro is worse:

www.lesswrong.com/posts/nAMhbz...

26.02.2026 19:56 — 👍 3    🔁 0    💬 0    📌 0

LLMs are general-purpose priors. You need to teach them. When you find a problem your AI gets slightly wrong, take note!

Structure your work to be automated, iterate, push up the 9's of reliability, move to a new problem. This is the first and final project of humanity.

(9/9)

26.02.2026 19:35 — 👍 5    🔁 0    💬 1    📌 0
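The "push up the 9's of reliability" framing can be made concrete with a toy model (assuming independent steps, which real agent traces aren't): per-step reliability compounds, so each extra nine roughly 10x's the feasible task length.

```python
# A task needing n autonomous steps succeeds with probability p**n
# (under the toy assumption that step failures are independent).
def task_success(p_step: float, n_steps: int) -> float:
    return p_step ** n_steps

# At 99% per step, a 100-step task usually fails;
# one more nine makes it usually succeed.
assert round(task_success(0.99, 100), 2) == 0.37
assert round(task_success(0.999, 100), 2) == 0.90
```

This is why grinding from 99% to 99.9% on a narrow problem can matter more than broad improvements elsewhere.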
RL-as-a-Service will outcompete AGI companies (and that's good) Hyping up a safer way to develop AI.

This will bring RL-as-a-Service to the fore. Experts with the tacit knowledge to figure out how to apply AI to tasks will hold the key to further progress.

(8/9)

splittinginfinity.substack.com/p/rl-as-a-se...

26.02.2026 19:35 — 👍 5    🔁 0    💬 1    📌 0
The end of benchmarks Imagine a society of apes trying to play tic tac toe. One day, they meet a very good human player, who also has a computer that has a superintelligent AI on it, trained to play tic tac toe. From their...

But now we're running out of benchmarks. We have to work harder to find errors. And how do we make progress if we can't differentiate between good and better?

(7/9)

www.sam-rodriques.com/post/the-end...

26.02.2026 19:35 — 👍 3    🔁 0    💬 1    📌 0
So if Data Is All You Need, what does that mean for AI development? A large and low-error dataset is sufficient to train a model to perform any task.

John von Neumann had it right; we can make an AI that can do anything so long as we can specify it. We’ll push forward the only way we know how, with bigger datasets and better benchmarks. We’ll find better methods by GitHub Descent. We’ll beg Scott Alexander for training data.

Once a benchmark is saturated we’ll make a new one. The benchmark doesn’t teach valuable skills? A stock trading benchmark! A benchmark can’t capture romantic love? A sexbot benchmark! Benchmark can’t identify AGI? ARC-AGI_final_v5!

This is why machine learning took off in the first place. The culture of openness, competitive benchmarks, and easy-to-steal ideas created a singularity of self-improvement. There’s no limit to what our hive mind can do. Simulate all of biology from gene sequencing data. Strap a GoPro to everyone’s head to emulate human behavior. Send robots to the stars and start iterating.

All it takes is the entire semiconductor industry, half of academia, a solar-industrial revolution, and everyone who ever posted on the internet.


This was the conclusion of my post on AI scaling: AI progress will flow from better benchmarks. (6/9)

26.02.2026 19:35 — 👍 7    🔁 0    💬 2    📌 0
Data Science at the Singularity A purported `AI Singularity' has been in the public eye recently. Mass media and US national political attention focused on `AI Doom' narratives hawked by social media influencers. The European Commis...

Language models are general purpose priors that can learn any task, given that you've defined it properly.

As Donoho's Disciples know, the process of properly defining a task and providing data for it *is* the source of AI progress. Everything else is secondary.

(5/9)

arxiv.org/abs/2310.00865

26.02.2026 19:35 — 👍 9    🔁 1    💬 1    📌 0
Post image Post image

And new benchmarks don't last long. Look at what happened to ARC-AGI. It became a focal point, research and iteration and scale ($) pushed us up the s-curve.

(4/9)

26.02.2026 19:35 — 👍 9    🔁 0    💬 1    📌 1
Post image

Just look at the state of benchmarks right now. Most of these are over ~70%. It's exciting when there's a new benchmark that models get below 50% on.

(3/9)

26.02.2026 19:35 — 👍 8    🔁 0    💬 1    📌 0

In practice it's actually really hard to find a defined problem that AI can't do: a question with a clear answer, one that other experts would agree on, that stumps the AI.

It takes a while, with several companies paying PhDs for each task that beats the model.

(2/9)

bsky.app/profile/hars...

26.02.2026 19:35 — 👍 7    🔁 1    💬 1    📌 0
Post image

I think benchmarks are the current and future bottleneck for AI. This is a big problem and not enough people are working on it. 🧵

(1/9)

26.02.2026 19:35 — 👍 33    🔁 6    💬 2    📌 1