Luca Righetti

@lucarighetti.bsky.social

Research Open_Phil, co-host HearThisIdea. Views my own. 🔸10% Pledge at GivingWhatWeCan.

63 Followers  |  75 Following  |  27 Posts  |  Joined: 21.09.2023  |  2.2333

Latest posts by lucarighetti.bsky.social on Bluesky

Forecasting Biosecurity Risks from LLMs – Forecasting Research Institute

You can read the complete report here --
forecastingresearch.org/ai-enabled-...

Huge thanks to @bridgetw_au and everyone at @Research_FRI for running this survey, as well as to @SecureBio for establishing the "a top team" baseline.

01.07.2025 15:09 – 👍 0    🔁 0    💬 0    📌 0

Still, there's a clear gap between expert perceptions in biosecurity and actual AI progress.

Policy needs to stay informed. We need to update these surveys as we learn more, add more evals, and replicate predictions with NatSec experts.

Better evidence = better decisions

01.07.2025 15:09 – 👍 0    🔁 0    💬 1    📌 0

To be clear, "AI matches a top team at VCT" is a high bar. I get why forecasters were surprised.

It means:
• A test designed specifically for bio troubleshooting
• AI outperforming five expert teams (postdocs from elite unis)
• Topics chosen by groups based on their expertise
x.com/DanHendryck...

01.07.2025 15:09 – 👍 0    🔁 0    💬 1    📌 0

How much should we trust these results? All forecasts should be treated cautiously. But two things do help:

• Experts and superforecasters mostly agreed
• Those with *better* calibration predicted *higher* levels of risk

(That's not common for surveys of AI and extreme risk!)

01.07.2025 15:09 – 👍 0    🔁 0    💬 1    📌 0

The good news:

Experts said if AI unexpectedly increases biorisk, we can still control it – via AI safeguards and/or checking who purchases DNA.

(68% said they'd support one or both of these policies; only 7% didn't.)

Action here seems critical for preserving AI's benefits.

01.07.2025 15:09 – 👍 0    🔁 0    💬 1    📌 0

I think this is part of a larger trend.

LLMs have hit many bio benchmarks in the last year. Forecasters weren't alarmed by those.

But "AI matches a top team at virology troubleshooting" is different – it seems to be the first result that's hard to just ignore.

01.07.2025 15:09 – 👍 0    🔁 0    💬 1    📌 0

How concerned should we be about AIxBio? We surveyed 46 bio experts and 22 superforecasters:

If LLMs do very well on a virology eval, human-caused epidemics could increase 2-5x.

Most thought this was >5yrs away. In fact, the threshold was hit just *months* after the survey. 🧵

01.07.2025 15:09 – 👍 0    🔁 0    💬 1    📌 0
Indict Evolution

Many thanks to my colleague Matthew van der Merwe for doing most of the online sleuthing here (and not on X).

Main sources:
[*] Court documents – static.foxnews.com/foxnews.com...
[*] YouTube – web.archive.org/web/2024090...
[*] Reddit – ihsoyct.github.io/index.html?...

09.06.2025 09:32 – 👍 0    🔁 0    💬 0    📌 0

It's worth remembering, US bombings are lower than they used to be. I doubt AI has affected this trend – and it's too early to tell what will happen.

But we have now seen two actual cases this year (Palm Springs IVF + Las Vegas Cybertruck). This threat is no longer theoretical.

09.06.2025 09:32 – 👍 0    🔁 0    💬 1    📌 0

And you can imagine scenarios far worse.

The suspect was an extreme anti-natalist (thinks life is wrong) and fascinated with nuclear.

His bomb didn't kill anyone (except himself), but his accomplice had a recipe similar to a larger explosive used in the OKC attack (killed 168).

09.06.2025 09:32 – 👍 0    🔁 0    💬 1    📌 0

Notably, a counter-terror strategy is to have police spot suspicious activity in online forums, using that to start investigations and undercover stings.

If more terrorists shift to asking AIs instead of posting online, this will work less well. Police should be aware of this blind spot.

09.06.2025 09:32 – 👍 0    🔁 0    💬 1    📌 0

By contrast, the suspect's (likely-but-unconfirmed) Reddit account also tried asking questions but didn't get any helpful replies.

It's not hard to imagine why an AI that is always ready to answer niche queries and able to have prolonged back-and-forths would be a useful tool.

09.06.2025 09:32 – 👍 0    🔁 0    💬 1    📌 0

Still, AI *did* answer many questions about explosives.

The court documents disclose one example, which seems in-the-weeds about how to maximize blast damage.

Many AIs are trained not to help with this. So either these queries weren't blocked, or the blocks were easy to bypass. That seems bad.

09.06.2025 09:32 – 👍 0    🔁 0    💬 1    📌 0

It's unclear how counterfactual the AI was.

A lot of info on bombs is already online and the suspect had been experimenting with explosives for years.

I'd guess it's unlikely AI made a big diff. for *this* suspect in *this* attack – but that's not to say it couldn't in other cases.

09.06.2025 09:32 – 👍 0    🔁 0    💬 1    📌 0

Three weeks ago a car bomb exploded outside an IVF clinic in California, injuring four people.

Now court documents against the bomber's accomplice show that the bomber asked AI to help build the bomb.

A thread on what I think those documents do and don't show 🧵…
x.com/CNBC/status...

09.06.2025 09:32 – 👍 0    🔁 0    💬 1    📌 0

OpenAI and Anthropic *both* warn there's a sig. chance that their next models might hit ChemBio risk thresholds -- and are investing in safeguards to prepare.

Kudos to OpenAI for consistently publishing these eval results, and great to see Anthropic now sharing a lot more too.

26.02.2025 00:49 – 👍 0    🔁 0    💬 0    📌 0

(My GW estimate comes from eyeballing Sri Lanka's electricity generation on Feb 9th vs. the week before. You can see the coal plant shut down.)

(h/t to @ElectricityMaps for collecting this data on almost every country in the world)

app.electricitymaps.com/zone/LK/72h...

17.02.2025 20:24 – 👍 0    🔁 0    💬 0    📌 0

Bizarre that a monkey can cause >10X the blackout damage of Russian hackers

17.02.2025 20:24 – 👍 0    🔁 0    💬 1    📌 0

(FYI: I won't write this scorecard up as a full blog post on PlOb. But I've posted this thread on my Substack, where I plan to share rougher notes like these.)

previousinstructions.substack.com/

10.12.2024 19:57 – 👍 0    🔁 0    💬 0    📌 0

Want to improve the "science of evals" and make dangerous capability tests more realistic? Tell us your ideas!

We've supported many tests that OAI and others now use, including work by people who are skeptical of AGI and AI risks.

Better evidence = better decisions

10.12.2024 19:57 – 👍 0    🔁 0    💬 1    📌 0

My verdict:

1 test suggests the "lower bound" lacks wet-lab skills; 4 can't rule it out. It's plausible o1 was ~fine to deploy, but it remains subjective.

The report is clearer and more nuanced, which helps build trust. The next one should go further – and include harder evals.

10.12.2024 19:57 – 👍 0    🔁 0    💬 1    📌 0

Big picture:

AIs keep saturating dangerous capability tests. With o1 we "ratcheted up" from multiple-choice to open-ended evals. But that won't hold for long.

We need harder evals – ones where, if an AI succeeds, that suggests a real risk. (No updates yet on OAI's wet-lab study.)

10.12.2024 19:57 – 👍 0    🔁 0    💬 1    📌 0

Some critical points:

Previously, I flagged that o1-preview's 69% score on the Gryphon eval might match PhDs.

Turns out, experts score 57%, so o1 passed this eval *months* ago. I hope OAI declares such results in future.

(I'd keep an eye on the multimodal eval with no PhD score yet)

10.12.2024 19:57 – 👍 0    🔁 0    💬 1    📌 0

Some things could still be improved:

• o1 underperf PhDs at *one* lab-skill eval (out of 5!) and it's not clear how that test was scored
• OAI says tinkering could boost scores, but not by how much (other orgs try to forecast this)
• Results are from a "near-final" o1 version

10.12.2024 19:57 – 👍 0    🔁 0    💬 1    📌 0

Credit where it's due. The new system card improved on the old one:

• More comparisons to PhD baselines (now exist for 3/5 evals vs. 0/3 before)
• Multiple-choice tests converted to open-ended, making them more realistic
• Clear acknowledgment these results are "lower bounds"

10.12.2024 19:57 – 👍 0    🔁 0    💬 1    📌 0

A few weeks ago, I "peer-reviewed" o1-preview's ChemBio safety card and highlighted some issues about its methodology.

Now that o1 is out, how does it stack up?

Better! (Though there's still room for improvement.)

Here's my new o1 scorecard. 🧵👇

10.12.2024 19:57 – 👍 2    🔁 1    💬 1    📌 0

Most climate deaths will occur in developing countries, especially in slow-growth scenarios where adaptation is unaffordable.

Framing climate change as an inequality problem – not an extinction risk – highlights the need for global aid, LMIC growth, and valuing all lives equally.

02.12.2024 20:16 – 👍 13    🔁 6    💬 0    📌 1