
Samidh

@samidh.bsky.social

Co-Founder at Zentropi (Trustworthy AI). Formerly Meta Civic Integrity Founder, Google X and Google Civic Innovation Lead, and Groq CPO.

845 Followers  |  96 Following  |  62 Posts  |  Joined: 01.09.2023

Posts by Samidh (@samidh.bsky.social)

... image classifiers also!

20.02.2026 23:42 — 👍 3    🔁 0    💬 1    📌 0

If you're not using either tool yet, now's a good time to try both! Zentropi's Community Edition is free and gives you unlimited labelers. Coop is fully open source and runs on your infrastructure.

:D

19.02.2026 22:06 — 👍 8    🔁 2    💬 0    📌 0

Thank you for your leadership and for being great stewards of Cove/Coop.

19.02.2026 19:20 — 👍 2    🔁 0    💬 1    📌 0
Zentropi Now Powers Coop Zentropi labelers can now be used as classifiers within Coop, ROOST's Open Source Moderation Platform

@dwillner.bsky.social and I have spent years watching T&S teams rebuild the same infrastructure from scratch. This is what it looks like when open tools actually work together instead. Really proud of this one and appreciative of @roost.tools's leadership!

Details: blog.zentropi.ai/zentropi-now...

19.02.2026 18:55 — 👍 4    🔁 2    💬 0    📌 2

Zentropi is now integrated into Coop, @roost.tools's open source moderation platform. You can write a content policy in plain English on Zentropi, plug it into Coop as a signal, and have a moderation pipeline running in minutes.

19.02.2026 18:55 — 👍 6    🔁 2    💬 1    📌 0
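The flow this post describes (plain-English policy → labeler → signal → moderation action) can be sketched as a toy pipeline. This is a hypothetical illustration only: `zentropi_label`, the label names, and the action mapping are all stand-ins I made up, not the actual Zentropi or Coop APIs.

```python
# Toy sketch of the policy -> labeler -> signal -> action pipeline.
# All names here are hypothetical, not the real Zentropi or Coop APIs.

def zentropi_label(text: str) -> str:
    """Stand-in for a Zentropi labeler call (a real integration would
    hit the Zentropi API with your labeler's ID and an API key)."""
    return "violating" if "spam" in text.lower() else "non-violating"

def moderate(text: str) -> str:
    """Map the labeler's signal to a moderation action, Coop-style."""
    return "remove" if zentropi_label(text) == "violating" else "allow"
```

In a real pipeline the keyword stub would be replaced by an API call, and Coop would consume the label as one signal among several rather than acting on it directly.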
AI is Removing Bottlenecks to Effective Content Moderation at Scale Zentropi's Dave Willner says LLM-driven technology can now accomplish content classification at the scale necessary for moderation on large platforms.

Dave Willner, who led trust and safety at major tech firms and cofounded a company developing an AI content classification platform, says LLM-driven technology can now accomplish classification at the scale necessary for moderation on large platforms. That has substantial implications.

29.01.2026 15:04 — 👍 5    🔁 1    💬 1    📌 2

I can has cats.

26.01.2026 18:55 — 👍 1    🔁 0    💬 0    📌 0
Zentropi Now Labels Images Building guardrails for visual content just got a lot easier. Today we're launching image classification on Zentropi and announcing cope-b-12b, a multimodal model that powers this experience.

Just shipped Zentropi's most requested feature: image classification!

Now analyze images against your own policies, at scale.

To power it we built cope-b-12b, a new multimodal model w/ native vision.

Check out the cat detector we made in < 1 min. 🐱
blog.zentropi.ai/zentropi-now-labels-images/

26.01.2026 18:03 — 👍 9    🔁 2    💬 0    📌 0

On the other hand, interesting contribution to do this all with a single transformer and candidate isolation.

21.01.2026 23:22 — 👍 5    🔁 0    💬 0    📌 0
GitHub - xai-org/x-algorithm: Algorithm powering the For You feed on X Algorithm powering the For You feed on X. Contribute to xai-org/x-algorithm development by creating an account on GitHub.

If you are looking for a technical description of how X rots your brain, look no further than their GitHub repo for the 'X algorithm'. It is pure, unadulterated behavioral engagement maximization that amplifies the very worst human impulses. github.com/xai-org/x-al...

21.01.2026 23:21 — 👍 45    🔁 22    💬 3    📌 1

Would love to hear more! What kind of community guidelines were you feeding to CoPE? What worked well and where were there gaps?

16.01.2026 03:34 — 👍 0    🔁 0    💬 0    📌 0

Why are we just giving away all our secrets? Well, it is our hope that it helps the ecosystem further advance the state of the art in policy-steerable content classification, which is foundational to a more trustworthy internet.

15.01.2026 19:21 — 👍 5    🔁 2    💬 0    📌 0

Dave just published a Zentropi labeler that can precisely identify requests aimed at prompting an AI model to undress a person in a photo. The tools exist to easily deal with this problem -- platforms just need to choose to use them. If you are the developer of an AI system, please use this guardrail!

13.01.2026 20:30 — 👍 5    🔁 3    💬 0    📌 1

"We'll make it right for you"

10.01.2026 04:47 — 👍 1    🔁 0    💬 0    📌 0

This was such a cool experiment that I created a Zentropi labeler with a simplified version of the authors' Partisan Animosity criteria. Now anyone can experiment directly with using this labeler to try to reduce the temperature of affective polarization in their feeds. zentropi.ai/labelers/b30...

03.12.2025 00:53 — 👍 9    🔁 2    💬 0    📌 0
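One way a labeler like this could "reduce the temperature" of a feed is to downrank posts the classifier scores high on partisan animosity. A minimal sketch, assuming each post already carries a classifier score in [0, 1] (the field names, threshold, and penalty are invented for illustration):

```python
def cool_feed(posts, threshold=0.7, penalty=0.5):
    """Rerank a feed, downranking posts whose partisan-animosity score
    exceeds the threshold. Each post is a dict with hypothetical fields:
    'text', 'engagement' (base ranking score), 'animosity' (labeler score).
    """
    def adjusted(p):
        # Multiply the base score by a penalty for high-animosity posts.
        factor = penalty if p["animosity"] > threshold else 1.0
        return p["engagement"] * factor

    return [p["text"] for p in sorted(posts, key=adjusted, reverse=True)]
```

A multiplicative penalty rather than outright removal keeps the content available while blunting the engagement advantage that divisive posts tend to enjoy.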
Observations on Toxicity We've published Zentropi's toxicity labeler (toxicity-public-s5), which you can integrate with your platform instantly using the Zentropi API. Browse the full policy to see how defining observable fea...

We just wrote an in-depth post about Toxic Content labeling. It presents a new way of defining toxic speech online -- and illustrates the importance of observable features for accurate language model interpretability. Would love to hear how YOU define toxicity, too! blog.zentropi.ai/observations...

13.11.2025 22:47 — 👍 10    🔁 2    💬 0    📌 0

Awesome to see how this is already being used! One of the most useful aspects is that the published policies show what it takes to write content rules that can be accurately interpreted by language models. We hope this can be a boost to the broader content policy community.

12.11.2025 00:07 — 👍 1    🔁 0    💬 0    📌 0

For clarity, the whole point of this launch is to enable people to easily customize their own policies so that we can support a plurality of content classification perspectives online! It is actually a solution to the problem Evelyn highlights in that piece.

11.11.2025 16:37 — 👍 1    🔁 0    💬 0    📌 0

This was a fun launch! It turns Zentropi into a GitHub for Content Labelers. You can share content policies with others and build off each other's work. It's the easiest way of deploying a fully customizable classifier. Check out the policies @dwillner.bsky.social created at zentropi.ai/u/dave

10.11.2025 23:58 — 👍 3    🔁 0    💬 0    📌 0

Content policies are usually private, one-off efforts. You build yours, I build mine, we don't share much about what works or why. This makes sense given products can (and should) set different policies based on their communities, but it leaves us reinventing the wheel. 🧵 1/5

10.11.2025 20:10 — 👍 18    🔁 6    💬 2    📌 3

#MakeMasnickWrongAgain

25.09.2025 15:52 — 👍 3    🔁 0    💬 0    📌 0

Social media algorithms push people who are near the extremes to further extremes. But it doesn't have to be this way.

11.09.2025 04:55 — 👍 1    🔁 0    💬 0    📌 0
Zentropi LLM Policy Writing Workshop Signup By popular demand, we will be hosting a virtual version of our sold-out TrustCon workshop on how to write high quality content policies with and for LLMs. In this session, you will learn best practic...

We got really positive feedback on the TrustCon workshop we ran on writing good content policies for LLMs...so we're doing it again! If you're interested go sign up here, so we can start to figure out timing: forms.gle/tj7vf7ng8n7R...

27.08.2025 18:11 — 👍 2    🔁 2    💬 0    📌 0

The standard damage control post from tech companies in these situations is to implicitly put the responsibility/blame on users. We can (and should) debate whether OpenAI's promised mitigations are sufficient, but the fact that they are tackling the *product problem* head-on is a vital first step.

27.08.2025 07:19 — 👍 1    🔁 0    💬 0    📌 0
Helping people when they need it most How we think about safety for users experiencing mental or emotional distress, the limits of today’s systems, and the work underway to refine them.

This response to the Raine tragedy from OpenAI does something remarkable: it has the humility to acknowledge that a *product failure* led to real-world harm. Despite horrific circumstances, it has a rare degree of honesty that I wish tech companies would show more often. openai.com/index/helpin...

27.08.2025 07:19 — 👍 2    🔁 0    💬 1    📌 1

Yes, would definitely love to directly integrate with roost/coop. Let's chat!

23.08.2025 03:08 — 👍 1    🔁 0    💬 0    📌 0

Would love for you to try plugging in zentropi next time! Can just point Claude at our documentation at zentropi.ai/api and it should just work after you generate an API key. I've actually done this myself for a Bluesky feed prototype.

23.08.2025 02:44 — 👍 1    🔁 0    💬 1    📌 0
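For anyone curious what such an integration might look like, here is a rough sketch of assembling a classification request. The endpoint path and JSON field names are guesses for illustration; the real schema is whatever zentropi.ai/api documents. The labeler ID reuses toxicity-public-s5, mentioned elsewhere in this feed.

```python
import json

def build_classify_request(api_key: str, labeler_id: str, text: str) -> dict:
    """Assemble (but do not send) a hypothetical Zentropi classification
    request. The URL shape and body fields are assumptions; check the
    docs at zentropi.ai/api for the actual schema."""
    return {
        # Assumed path; the real endpoint may differ.
        "url": f"https://api.zentropi.ai/v1/labelers/{labeler_id}/classify",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"content": text}),
    }
```

The dict can then be passed to any HTTP client; keeping request construction separate from transport also makes the integration easy to unit-test without network access.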

Reasonably sized PBCs FTW.

23.08.2025 02:31 — 👍 1    🔁 0    💬 1    📌 0

We've done some benchmarking on how well gpt-oss-20b does on classification tasks. Significant step up from gpt-4o, especially for normie policies, but still stubbornly inflexible for custom policies.

23.08.2025 02:29 — 👍 1    🔁 0    💬 1    📌 1
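A benchmark like the one mentioned here ultimately boils down to comparing model labels against gold labels, broken out by policy (e.g. standard vs. custom). A minimal sketch; the data shape is invented for illustration:

```python
from collections import defaultdict

def accuracy_by_policy(rows):
    """rows: iterable of (policy_name, gold_label, model_label) triples.
    Returns a dict mapping each policy to the model's accuracy on it."""
    hits, totals = defaultdict(int), defaultdict(int)
    for policy, gold, pred in rows:
        totals[policy] += 1
        hits[policy] += int(gold == pred)
    return {p: hits[p] / totals[p] for p in totals}
```

Splitting accuracy by policy is what surfaces the gap the post describes: a model can look strong on common ("normie") policies while lagging badly on custom ones.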
Zentropi: Build Your Own Content Labeler in Minutes, Not Months We are officially opening up our build-your-own-content-labeler platform to everyone. Check it out at zentropi.ai.

We are opening up Zentropi.ai to everyone today so that anyone can build their own content labeler. What started as a crazy academic idea 2 years ago is now a real thing that companies are using in production to safeguard their AI-powered systems. Give it a shot! blog.zentropi.ai/zentropi-bui...

19.08.2025 16:01 — 👍 3    🔁 1    💬 0    📌 0