
Samidh

@samidh.bsky.social

Co-Founder at Zentropi (Trustworthy AI). Formerly Meta Civic Integrity Founder, Google X and Google Civic Innovation Lead, and Groq CPO.

808 Followers  |  91 Following  |  50 Posts  |  Joined: 01.09.2023

Latest posts by samidh.bsky.social on Bluesky

This was such a cool experiment that I created a Zentropi labeler with a simplified version of the authors' Partisan Animosity criteria. Now anyone can experiment directly with using this labeler to try to reduce the temperature of affective polarization in their feeds. zentropi.ai/labelers/b30...

03.12.2025 00:53 | 👍 8  🔁 1  💬 0  📌 0
Observations on Toxicity We've published Zentropi's toxicity labeler (toxicity-public-s5), which you can integrate with your platform instantly using the Zentropi API. Browse the full policy to see how defining observable fea...

We just wrote an in-depth post about Toxic Content labeling. It presents a new way of defining toxic speech online and illustrates the importance of observable features for accurate language model interpretability. Would love to hear how YOU define toxicity, too! blog.zentropi.ai/observations...

13.11.2025 22:47 | 👍 9  🔁 2  💬 0  📌 0

Awesome to see how this is already being used! One of the most useful aspects is that the published policies show what it takes to write content rules that can be accurately interpreted by language models. We hope this can be a boost to the broader content policy community.

12.11.2025 00:07 | 👍 1  🔁 0  💬 0  📌 0

For clarity, the whole point of this launch is to enable people to easily customize their own policies so that we can support a plurality of content classification perspectives online! It is actually a solution to the problem Evelyn highlights in that piece.

11.11.2025 16:37 | 👍 1  🔁 0  💬 0  📌 0

This was a fun launch! It turns Zentropi into a GitHub for Content Labelers. You can share content policies with others and build off each other's work. It's the easiest way of deploying a fully customizable classifier. Check out the policies @dwillner.bsky.social created at zentropi.ai/u/dave

10.11.2025 23:58 | 👍 3  🔁 0  💬 0  📌 0

Content policies are usually private, one-off efforts. You build yours, I build mine, we don't share much about what works or why. This makes sense given products can (and should) set different policies based on their communities, but it leaves us reinventing the wheel. 🧵 1/5

10.11.2025 20:10 | 👍 18  🔁 7  💬 2  📌 3

#MakeMasnickWrongAgain

25.09.2025 15:52 | 👍 3  🔁 0  💬 0  📌 0

Social media algorithms push people who are near the extremes to further extremes. But it doesn't have to be this way.

11.09.2025 04:55 | 👍 1  🔁 0  💬 0  📌 0
Zentropi LLM Policy Writing Workshop Signup By popular demand, we will be hosting a virtual version of our sold-out TrustCon workshop on how to write high quality content policies with and for LLMs. In this session, you will learn best practic...

We got really positive feedback on the TrustCon workshop we ran on writing good content policies for LLMs...so we're doing it again! If you're interested go sign up here, so we can start to figure out timing: forms.gle/tj7vf7ng8n7R...

27.08.2025 18:11 | 👍 2  🔁 2  💬 0  📌 0

The standard damage control post from tech companies in these situations is to implicitly put the responsibility/blame on users. We can (and should) debate whether OpenAI's promised mitigations are sufficient, but the fact that they are tackling the *product problem* head-on is a vital first step.

27.08.2025 07:19 | 👍 1  🔁 0  💬 0  📌 0
Helping people when they need it most How we think about safety for users experiencing mental or emotional distress, the limits of today’s systems, and the work underway to refine them.

This response to the Raine tragedy from OpenAI does something remarkable: it has the humility to acknowledge that a *product failure* led to real-world harm. Despite horrific circumstances, it has a rare degree of honesty that I wish tech companies would show more often. openai.com/index/helpin...

27.08.2025 07:19 | 👍 2  🔁 0  💬 1  📌 1

Yes, would definitely love to directly integrate with roost/coop. Let's chat!

23.08.2025 03:08 | 👍 1  🔁 0  💬 0  📌 0

Would love for you to try plugging in Zentropi next time! You can just point Claude at our documentation at zentropi.ai/api, and it should work once you generate an API key. I've actually done this myself for a Bluesky feed prototype.

23.08.2025 02:44 | 👍 1  🔁 0  💬 1  📌 0
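The integration described above can be sketched as a minimal request builder. To be clear, this is an illustrative assumption, not the documented Zentropi API: the endpoint URL, payload field names, and Bearer-token auth scheme are all hypothetical placeholders (the real contract lives at zentropi.ai/api). The labeler ID `toxicity-public-s5` is the one named in the toxicity post above.

```python
import json

# Hypothetical sketch of calling a Zentropi labeler over HTTP.
# The endpoint path, payload fields, and auth header below are
# ASSUMPTIONS for illustration -- consult zentropi.ai/api for the
# actual API contract before integrating.
API_URL = "https://api.zentropi.ai/v1/label"  # assumed endpoint


def build_label_request(text: str, labeler_id: str, api_key: str):
    """Assemble the URL, headers, and JSON body for one classification call."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    body = {"labeler": labeler_id, "content": text}  # assumed field names
    return API_URL, headers, json.dumps(body)


url, headers, payload = build_label_request(
    "example post text", "toxicity-public-s5", "YOUR_API_KEY"
)
# Send with any HTTP client, e.g.:
# requests.post(url, headers=headers, data=payload)
```

Keeping the request assembly separate from the transport makes it easy to drop into a Bluesky feed generator or a moderation queue with whatever HTTP client the host app already uses.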

Reasonably sized PBCs FTW.

23.08.2025 02:31 | 👍 1  🔁 0  💬 1  📌 0

We've done some benchmarking on how well gpt-oss-20b does on classification tasks. Significant step up from gpt-4o, especially for normie policies, but still stubbornly inflexible for custom policies.

23.08.2025 02:29 | 👍 1  🔁 0  💬 1  📌 1
Zentropi: Build Your Own Content Labeler in Minutes, Not Months We are officially opening up our build-your-own-content-labeler platform to everyone. Check it out at zentropi.ai.

We are opening up Zentropi.ai to everyone today so that anyone can build their own content labeler. What started as a crazy academic idea 2 years ago is now a real thing that companies are using in production to safeguard their AI-powered systems. Give it a shot! blog.zentropi.ai/zentropi-bui...

19.08.2025 16:01 | 👍 3  🔁 1  💬 0  📌 0

Don't take our word for it! Go kick the tires at zentropi.ai and build your own content labeler (no subscription required!)

31.07.2025 21:43 | 👍 4  🔁 1  💬 0  📌 0

@mmasnick.bsky.social I have a bluesky demo for you that you might want to see :)

31.07.2025 21:41 | 👍 3  🔁 0  💬 1  📌 0

Just tested this on a few that I know Reddit's existing Hatred & Harassment automation has blind spots for; it built a focused, accurate labeler in under 10 minutes with a dozen examples, and the human-readable criteria it produced could be dropped into a training manual or erratum, or used to build a regex

31.07.2025 19:48 | 👍 12  🔁 5  💬 2  📌 0

So excited for #TrustCon this week! We will be publicly unveiling Zentropi, a platform that helps people instantly build their own content labelers. We'll be opening it up for early access and open sourcing the underlying language model we trained for the task so that it is accessible to everyone.

21.07.2025 01:13 | 👍 8  🔁 1  💬 0  📌 0

I expect @dwillner.bsky.social to run around like a maniac again at #Trustcon this year as he shows off Zentropi -- our platform that makes it simple to build your own CoPE-powered content labeler.

18.07.2025 19:30 | 👍 2  🔁 0  💬 1  📌 0

A year ago at #TrustCon, I ran around like a maniac showing people something on my laptop. We'd just gotten CoPE - our policy interpretation model - working. It felt like a huge achievement, validating our ideas about LLM-powered labeling. 🧵 1/7

18.07.2025 19:21 | 👍 39  🔁 8  💬 2  📌 3

Take back your attention.

21.01.2025 17:12 | 👍 23492  🔁 2199  💬 412  📌 104
Supreme Court Upholds Law That Threatens US TikTok Ban The Supreme Court unanimously upheld a law that threatens to shut down the wildly popular TikTok social media platform in the US as soon as Sunday, ruling that free speech rights must yield to concern...

The splinternet accelerates. If this stands, look for more countries in 2025 to ban Facebook, Instagram, YouTube, etc. out of fears of American surveillance. www.bloomberg.com/news/article...

17.01.2025 20:39 | 👍 6  🔁 0  💬 0  📌 0

When I access that @nytimes.com article, it doesn't say "for Facebook's culture". Instead it says "for an inclusivity initiative at Facebook that encouraged employees’ self-expression in the workplace". Was it edited or updated?

16.01.2025 23:45 | 👍 8  🔁 0  💬 2  📌 0
Types of content we demote | Transparency Center Meta regularly publishes reports to give our community visibility into community standards enforcement, government requests and internet disruptions

@caseynewton.bsky.social In terms of other news feed demotions that could be rolled back (or maybe already have been), take a look at this blog post. Would be smart to track it as it is frequently updated: transparency.meta.com/features/app...

15.01.2025 05:21 | 👍 6  🔁 0  💬 0  📌 0
No Evidence Taylor Swift Donated $10M to California Wildfire Fund If the pop star has donated, she hasn't announced it publicly.

A huge chunk of demoted misinfo on FB is not explicitly about politics at all. Now these hoaxes will get massive amplification. One recent example from Snopes (which was an important fact checker!): www.snopes.com/news/2025/01...

15.01.2025 05:04 | 👍 3  🔁 0  💬 0  📌 0
Zuckerberg Says Most Companies Need More 'Masculine Energy' Mark Zuckerberg lamented the rise of "culturally neutered" companies that have sought to distance themselves from "masculine energy," adding that it's good if a culture "celebrates the aggression a bi...

If someone on my team at Meta had ever said these kinds of words, I'm pretty sure I'd have had an obligation to notify HR. But maybe times have changed and "masculine energy" is now part of the performance review rubric. www.bloomberg.com/news/article...

12.01.2025 20:01 | 👍 21  🔁 2  💬 4  📌 2

Don't forget taking out TikTok.

12.01.2025 19:36 | 👍 3  🔁 0  💬 1  📌 0

That's interesting. Seems like the deadline for donations could be set to be before the election then?

12.01.2025 01:29 | 👍 7  🔁 0  💬 1  📌 0
