Dave Willner

@dwillner.bsky.social

Co-Founder at Zentropi. Formerly Head of Trust & Safety at OpenAI, of Community Policy at Airbnb, and of Content Policy at Facebook. Strictly cold takes.

9,644 Followers  |  1,869 Following  |  216 Posts  |  Joined: 06.05.2023  |  2.4021

Latest posts by dwillner.bsky.social on Bluesky

That's not how the End Times work Mike!

22.11.2025 01:41 - 👍 7    🔁 0    💬 0    📌 0

Bellingcat’s contact email has always been a magnet for people with fairly unusual views; paranoid delusions, sprawling conspiracies, the works. But recently, the pattern has shifted, we’re seeing more and more emails clearly written with ChatGPT.

19.11.2025 14:18 - 👍 3089    🔁 813    💬 54    📌 268

this administration, and its congressional allies, are free speech phonies. not warriors. phonies. censors. propagandists.

20.11.2025 01:55 - 👍 13    🔁 5    💬 0    📌 1
DHS playing 'whack-a-mole' shooting down made-up ICE stories The Department of Homeland Security is stepping up efforts to combat fake news, viral AI videos, and misinformation on ICE and Border Patrol.

Just wanna re-up in simple terms that when Biden talked to platforms, Jim Jordan launched years of investigations into everybody involved, said it was tyranny, censorship, etc.

And now they just straight up acknowledge that they talk to platforms too.

www.washingtonexaminer.com/news/crime/3...

20.11.2025 01:54 - 👍 519    🔁 146    💬 10    📌 8

💸

18.11.2025 03:44 - 👍 2    🔁 0    💬 0    📌 0

Totally feel free to point folks to the blog, it won’t poison anything!

13.11.2025 23:14 - 👍 1    🔁 0    💬 1    📌 0
Observations on Toxicity We've published Zentropi's toxicity labeler (toxicity-public-s5), which you can integrate with your platform instantly using the Zentropi API. Browse the full policy to see how defining observable fea...

We just wrote an in-depth post about Toxic Content labeling. It presents a new way of defining toxic speech online-- and illustrates the importance of observable features for accurate language model interpretability. Would love to hear how YOU define toxicity, too! blog.zentropi.ai/observations...

13.11.2025 22:47 - 👍 5    🔁 2    💬 0    📌 0
Observations on Toxicity We published a novel toxicity labeler (toxicity-public-s5), which you can integrate with your platform instantly using the Zentropi API. Browse the full policy to see how defining observable features ...

I’ve had a very “text-oriented” view of content labeling for a long time, and used the opportunity of our recent launch to lay out some of those ideas in the context of the idea of “toxicity”

Interested to know what others think!

blog.zentropi.ai/observations...

13.11.2025 22:56 - 👍 6    🔁 2    💬 1    📌 1
Zentropi - Build Custom Content Labelers Instantly

Find these at zentropi.ai and on my profile at zentropi.ai/u/dave. As more folks write and publish policies, we'll be adding the best ones to the featured section to give people more options. 🧵 5/5

10.11.2025 20:10 - 👍 4    🔁 0    💬 0    📌 0

Our goal here isn't to provide perfect policies or tell anyone what rules they should have. It's to provide examples of what actually works when writing content policies for LLM interpretation that anyone can then adapt to fit their own needs. 🧵 4/5

10.11.2025 20:10 - 👍 2    🔁 0    💬 1    📌 0

To start things off, I've built 7 policies - harassment, hate, violence, self-harm, sexual content, drugs, and toxicity. All created using Zentropi itself, with a touch of editing on my end. Examples of what's possible, starting points to adapt. 🧵 3/5

10.11.2025 20:10 - 👍 6    🔁 0    💬 1    📌 1

We're hoping to change this by encouraging more public sharing of people's work. So, today, we're launching policy discovery and sharing on Zentropi. Browse featured policies, find more from authors you like, fork what fits, and adapt for your context. 🧵 2/5

10.11.2025 20:10 - 👍 4    🔁 1    💬 1    📌 1

Content policies are usually private, one-off efforts. You build yours, I build mine, we don't share much about what works or why. This makes sense given products can (and should) set different policies based on their communities, but it leaves us reinventing the wheel. 🧵 1/5

10.11.2025 20:10 - 👍 18    🔁 7    💬 2    📌 3

You should resign both your leadership position and your seat so that someone who is up to this can take over.

10.11.2025 03:54 - 👍 3    🔁 0    💬 0    📌 0

*whispers* you can continue to read me, the pundit who insisted the other pundits were wrong about these conclusions

05.11.2025 15:06 - 👍 5907    🔁 557    💬 67    📌 15

Go U Bears!

05.11.2025 02:54 - 👍 1    🔁 0    💬 1    📌 0

Picture of the East Wing demolition of the White House taken on my flight out of DCA.

23.10.2025 17:16 - 👍 15066    🔁 5849    💬 1205    📌 1081

It’s recorded iirc, should be up on YouTube!

22.10.2025 22:43 - 👍 3    🔁 0    💬 1    📌 0
Automating Content Policy AI is no longer just moderating individual posts; it is learning how to interpret and enforce policy itself. Dave Willner, who has led trust and safety teams at Facebook, Airbnb, and OpenAI, joins ...

I am forgetful about self-promotion, so dropping a last minute link to note that I’m giving a talk at Berkman Klein today. Come check it out if you’re free, or catch the recording later:

cyber.harvard.edu/events/autom...

22.10.2025 16:08 - 👍 20    🔁 4    💬 2    📌 2

I feel like some of the difference in reactions here also rests on how frequently you have to do a somewhat complex, but very repetitive, task. Taking the time to get these sorts of workflows really dialed in is most useful for stuff you do over and over.

17.10.2025 22:00 - 👍 7    🔁 1    💬 0    📌 0

I’ve never understood why, because it’s not like this was a subtle theme!

16.10.2025 03:02 - 👍 11    🔁 0    💬 0    📌 0

He would definitely have also hated that 😂

16.10.2025 03:01 - 👍 10    🔁 0    💬 0    📌 0

The thing that always gets me is that Tolkien obviously would have hated Silicon Valley generally, and these particular guys specifically.

16.10.2025 00:52 - 👍 333    🔁 35    💬 2    📌 8

Tyranny is brittle.

10.10.2025 17:16 - 👍 53    🔁 11    💬 1    📌 0

Right? I don’t even know what the current fight is about, but let’s not be silly now.

03.10.2025 15:11 - 👍 4    🔁 0    💬 1    📌 0

So, the first part of this is plainly false, both historically and currently. I don’t think it’s a good thing in most cases…but it’s plainly the case that pressuring the people in charge of moderation to either ban (or not ban) people works *All The Time*. It is why people do it!

03.10.2025 15:03 - 👍 50    🔁 9    💬 3    📌 1
Moderating is Such Sweet Sorrow - Ctrl-Alt-Speech In this week’s roundup of the latest news in online speech, content moderation and internet regulation, Mike is joined by Dave Willner, founder of Zentropi, and long-time trust & safety expert who...

New Ctrl-Alt-Speech: Moderating is Such Sweet Sorrow with guest host @dwillner.bsky.social who is entirely responsible for bringing up Shakespeare as part of this discussion. (@benwhitelaw.bsky.social will be back next week!)

podcast.ctrlaltspeech.com/2315966/epis...

01.10.2025 23:25 - 👍 12    🔁 4    💬 1    📌 1

While terrible, this is entirely unsurprising. If you hold serious safety efforts in contempt, this sort of thing is inevitable.

23.09.2025 04:59 - 👍 78    🔁 22    💬 1    📌 0

Disney/ABC have a responsibility to refuse to participate in corruption.

Kimmel must be reinstated. If Disney/ABC agree to this extortion then perhaps creatives + workers should consider collective action to push back. Same w/buying park + cruise tickets if they bow.

People have power. Ask Target

20.09.2025 01:13 - 👍 59406    🔁 13447    💬 1804    📌 531

No one who agrees to this is a journalist.

20.09.2025 00:27 - 👍 11    🔁 4    💬 0    📌 0
