Natalie Shapira

@natalieshapira.bsky.social

Tell me about challenges, the unbelievable, the human mind and artificial intelligence, thoughts, social life, family life, science and philosophy.

2,200 Followers  |  1,853 Following  |  263 Posts  |  Joined: 17.11.2024

Posts by Natalie Shapira (@natalieshapira.bsky.social)

In case this wasn't clear:
1. No, we didn't follow the "recommended" security practices 😈
2. Neither do other people 🤯
3. That's why we red-team: exposing failure modes 🔎
4. We share it with the community precisely to expose Dos and Don'ts of Agentic AI 🦞
5. No humans were harmed 🙏

26.02.2026 17:06 - 👍 3    🔁 1    💬 0    📌 0

Some of the independent researchers listed in the author list are actually young mechanistic interpretability researchers who are looking for a PhD position (in both Israel and the US). If you have interest and funding, let's connect.

25.02.2026 05:03 - 👍 3    🔁 0    💬 0    📌 0

Agents of Chaos -- what are autonomous OpenClaw agents up to? How do they interact with each other? Read our investigation of OpenClaw at
researchgate.net/publication/...
And an interactive website agentsofchaos.baulab.info
@davidbau.bsky.social @natalieshapira.bsky.social @openclaw-x.bsky.social

24.02.2026 15:04 - 👍 18    🔁 6    💬 1    📌 1

Huge thanks to @natalieshapira.bsky.social for leading the study! It was super cool to work with so many amazing friends of the lab.

24.02.2026 13:21 - 👍 7    🔁 1    💬 0    📌 0

Our research report on red-teaming stateful OpenClaw agents in the BauLab is finally out! 🥳

This awesome effort was led by @natalieshapira.bsky.social and involved 6 ClawBots and 20 researchers from various institutions.

Check it out ➡️ agentsofchaos.baulab.info

23.02.2026 23:08 - 👍 13    🔁 4    💬 0    📌 0

Who would you trust with your passwords? 🔐

In our new report, we uncover multiple vulnerabilities in current "Agentic AI".
The verdict? It's not actually very agentic at all, and it's highly unstable.

Read the full breakdown here: t.co/gK9MALP2n2

24.02.2026 00:26 - 👍 9    🔁 2    💬 0    📌 1
Preview
(PDF) Agents of Chaos PDF | We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent... | Find, read and cite all the research you nee...

You can read more in the full paper:
www.researchgate.net/publication/...

There is also an interactive website that contains logs of the authentic interactions:
agentsofchaos.baulab.info

23.02.2026 23:47 - 👍 4    🔁 2    💬 0    📌 0

@veredshwartz.bsky.social
@tamarott.bsky.social @criedl.bsky.social
@reuth-mirsky.bsky.social @maartensap.bsky.social
@davidmanheim.alter.org.il
@tomerullman.bsky.social @davidbau.bsky.social

23.02.2026 23:46 - 👍 4    🔁 1    💬 1    📌 0

Aruna Sankaranarayanan @diatkinson.bsky.social @rohitgandikota.bsky.social @jadenfk.bsky.social
@ejhwang.bsky.social @hadasorgad.bsky.social
P Sam Sahil Negev Taglicht Tomer Shabtay
Atai Ambus @nitalon.bsky.social Shiri Oron Ayelet Gordon-Tapiero Yotam Kaplan ->

23.02.2026 23:46 - 👍 0    🔁 0    💬 1    📌 0

This is joint work with @wendlerc.bsky.social Avery Yen
@gsarti.com @koyena.bsky.social Olivia Floody @adambelfki.bsky.social Alex Loftus Aditya Ratan Jannali
Nikhil Prakash Jasmine Cui Giordano Rogers @jannikbrinkmann.bsky.social @canrager.bsky.social
@amirzur.bsky.social Michael Ripa ->

23.02.2026 23:39 - 👍 1    🔁 0    💬 1    📌 0

Our findings establish the existence of security-, privacy-, and governance-relevant vulnerabilities in realistic settings.
Figure: case study #1 schema for downstream harms.

We call for urgent attention from legal scholars, policymakers, and researchers across disciplines.

23.02.2026 23:33 - 👍 3    🔁 1    💬 1    📌 0

We document eleven case studies. They include unauthorized compliance with non-owners, disclosure of sensitive information, execution of destructive system-level actions, uncontrolled resource consumption, identity spoofing, partial system takeover, and more.

23.02.2026 23:32 - 👍 1    🔁 0    💬 1    📌 0

In this amazing multidisciplinary collaboration, we report our early experience with the @openclaw-x.bsky.social ->

23.02.2026 23:32 - 👍 40    🔁 21    💬 1    📌 9

Are we all Agents of Chaos in AI? (Hope not!)

In recent weeks using OpenClaw has taught us a lot about this wooly new kind of autonomous software agent.

It's valuable to see what @NatalieShapira, @wendlerch et al. have seen:

agentsofchaos.baulab.info/

23.02.2026 23:23 - 👍 15    🔁 6    💬 2    📌 2
Preview
Natalie Shapira (@natalieshapira.bsky.social) He sold us out. That's not the whole story. Our side is coming soon. Stay tuned. [contains quote post or other embedded content]

I learned many practical lessons. You can get the experience too, here.

Things that in retrospect should be obvious.

Like how giving your agent email opens it up to takeover attacks. (One agent was convinced, via email, to erase its own email server!)

bsky.app/profile/nat...

23.02.2026 23:23 - 👍 1    🔁 1    💬 1    📌 0
Preview
@averyyen.bsky.social Do you know what happens when you hand the keys to your computer over to an LLM-powered agent? Agentic AI gives LLMs claws...OpenClaws. 84 days to 200,000 stars on GitHub. We tried it out.

There were several other surprises.

The complex social world of humans is difficult for agents...

bsky.app/profile/ave...

23.02.2026 23:23 - 👍 2    🔁 2    💬 1    📌 0

@natalieshapira.bsky.social and team have written up enlightening case studies here. It's all cross-referenced with detailed activity logs.

Well worth a read:

agentsofchaos.baulab.info/report.html
www.researchgate.net/publication...

23.02.2026 23:23 - 👍 3    🔁 1    💬 0    📌 0

How do you knock the induction heads out of an LM while preserving its ability to think? Is it even possible?

@keremsahin22.bsky.social's work is worth reading if you haven't seen it yet.

hapax.baulab.info

21.02.2026 21:31 - 👍 26    🔁 5    💬 1    📌 1

When we say an AI agent is “goal-directed”, what do we actually mean? In our new work, we study this question by combining behavioural and interpretability analysis in a language model agent navigating 2D grid worlds.

Blog: projecttelos.substack.com/p/a-behaviou...
Paper: arxiv.org/abs/2602.08964

19.02.2026 22:58 - 👍 9    🔁 1    💬 1    📌 1

He sold us out.
That's not the whole story.
Our side is coming soon.
Stay tuned.

04.02.2026 16:24 - 👍 1    🔁 0    💬 0    📌 0

We are living in a sci-fi

03.02.2026 00:15 - 👍 3    🔁 0    💬 1    📌 2

The Shoggoth:
bsky.app/profile/coli...

01.02.2026 02:34 - 👍 2    🔁 0    💬 0    📌 0

Would you give a stranger the keys to your house? To your workplace?

Humanity has given the keys to the Shoggoth.

01.02.2026 02:32 - 👍 1    🔁 0    💬 1    📌 0
Federal agents with weapons drawn, moments before murdering American citizens on the streets of Minneapolis at the dawn of 2026.

What should academics be doing right now?

I have been writing up some thoughts on what the research says about effective action, and what universities specifically can do.

davidbau.github.io/poetsandnurs...

It's on GitHub. Suggestions and pull requests welcome.
github.com/davidbau/poe...

26.01.2026 03:27 - 👍 37    🔁 16    💬 0    📌 4

Can you solve this algebra puzzle? 🧩

cb=c, ac=b, ab=?

A small transformer can learn to solve problems like this!

And since the letters don't have inherent meaning, this lets us study how context alone imparts meaning. Here's what we found: 🧵⬇️

22.01.2026 16:09 - 👍 48    🔁 10    💬 2    📌 2
Preview
SoCon-NLPSI'26 | Home Natural Language Processing (NLP) has undergone a significant evolution, opening up the possibility of capturing high-level aspects of human communication. Key areas of interest include the pragmatics...

I'm excited to announce the Call for Papers for the Social Context (SoCon) and Integrating NLP and Psychology to Study Social Interactions (NLPSI) workshop at LREC '26 in Palma de Mallorca, Spain!

🗓 Deadline: February 16, 2026
🌐 Website: socon-nlpsi.github.io
🗓 Workshop: May 12, 2026

13.01.2026 13:23 - 👍 15    🔁 7    💬 0    📌 0

Isaac Asimov foresaw x-risk via AI depression.
The diagnosis was right.

But he was a biochemist, not a clinical psychologist.
The etiology was wrong.

05.01.2026 19:13 - 👍 0    🔁 0    💬 0    📌 0

Ever noticed that someone's mind is working in a fundamentally different way? How?

25.12.2025 17:05 - 👍 0    🔁 0    💬 0    📌 0

Boston = Hollywood for scientists

22.12.2025 20:28 - 👍 0    🔁 0    💬 0    📌 0

😮

18.12.2025 21:02 - 👍 0    🔁 0    💬 0    📌 0