Jenna Russell (@jennarussell) — Bluesky Profile

4 months ago

Thanks to my amazing coauthors
@markar.bsky.social, Destiny Akinode, @kthai1618.bsky.social, Bradley Emi, Max Spero, and @miyyer.bsky.social, and the support of the UMD CLIP lab and Pangram Labs

1 0 0 0

4 months ago

We will be continuously monitoring American news to keep up with how AI use changes over time. Follow along at 🌐 ainewsaudit.github.io

2 1 1 1

4 months ago

AI use in American newspapers is widespread, uneven, and rarely disclosed
AI is rapidly transforming journalism, but the extent of its use in published newspaper articles remains unclear. We address this gap by auditing a large-scale dataset of 186K articles from online...

We’re releasing:
🌐 Browse articles: ainewsaudit.github.io
📂 Datasets (recent_news, opinions, ai_reporters): github.com/jenna-russe...
📄 Paper: arxiv.org/abs/2510.18774

8 3 1 0

4 months ago

AI has been creeping into the news all of us read, often without any disclosure. We call for clearly defined standards for U.S. newsrooms:
1️⃣ Clearly define what counts as acceptable use of AI and publish these standards openly
2️⃣ Require AI-use attestations for all writers

15 4 1 0

4 months ago

Many AI-written stories still contain authentic quotes. We hypothesize that people often use AI for editing or expanding on their human-written work. But with no disclosure, there's no way to tell for sure.

2 0 1 0

4 months ago

We also track how AI adoption has evolved over time:
Among 10 veteran reporters we followed longitudinally, AI use rose from 0% pre-ChatGPT (2022) to >40% in 2025.

0 0 1 0

4 months ago

AI is disproportionately affecting news written in languages other than English. Roughly 8% of English-language news is AI-generated, compared to 33% of non-English news (primarily Spanish). Without disclosure, we cannot be sure whether AI is translating stories or writing them.

0 0 1 0

4 months ago

In NYT, WaPo & WSJ, opinion sections show 6.4× higher AI use than other sections, rising ~25× since 2022 (from ~0% → ~4%).
AI use is concentrated among prominent guest authors: politicians, CEOs, and scientists.

4 2 1 2

4 months ago

Despite widespread use, transparency is basically nonexistent.
Out of 100 AI-flagged articles we manually annotated, only 5 disclosed that AI was used, and over 90% of outlets have no public AI policy.

3 0 1 0

4 months ago

AI use isn’t evenly distributed:
🗞️ Far higher in small local papers than national outlets
🌎 Especially common in Mid-Atlantic & Southern states
🏢 Largely driven by ownership groups (e.g. Boone Newsmedia & Advance Publications)
🧭 Most concentrated in weather, tech, and health

3 4 1 0

4 months ago

Pangram Labs AI Detection
The most accurate technology to detect AI-generated content. Detects ChatGPT, Gemini, Meta AI, Claude, and more. Supports 20+ languages with 99.98%+ accuracy.

We detect AI using Pangram, a model with a reported false positive rate of 0.001% on news text. We find that 5.2% of recent news is completely AI-generated, with another 3.9% partially AI-generated. www.pangram.com/
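
As a rough sanity check on how these figures fit together, here is a minimal sketch of the arithmetic. It assumes the percentages apply to the full set of 186k articles mentioned elsewhere in the thread and treats the quoted numbers as exact; the function and variable names are illustrative only and do not come from the paper or from Pangram.

```python
# Figures quoted in the posts; treating them as exact is an assumption.
TOTAL_ARTICLES = 186_000        # "186k articles"
FULLY_AI_SHARE = 0.052          # 5.2% completely AI-generated
PARTIALLY_AI_SHARE = 0.039      # 3.9% partially AI-generated
FALSE_POSITIVE_RATE = 0.00001   # 0.001% reported false positive rate


def summarize(total, fully, partially, fpr):
    """Return rough article counts implied by the quoted percentages."""
    flagged_share = fully + partially
    return {
        "flagged_share": flagged_share,
        "flagged_articles": round(total * flagged_share),
        # Upper bound: applies the false positive rate to every article,
        # although only human-written articles can be false positives.
        "expected_false_positives": total * fpr,
    }


summary = summarize(
    TOTAL_ARTICLES, FULLY_AI_SHARE, PARTIALLY_AI_SHARE, FALSE_POSITIVE_RATE
)
print(summary)
```

Under these assumptions the combined share is about 9.1%, or roughly 16,900 flagged articles, while the reported false positive rate would account for fewer than 2 of them.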

2 1 1 0

4 months ago

AI is already at work in American newsrooms.

We examine 186k articles published this summer and find that ~9% are either fully or partially AI-generated, usually without readers having any idea.

Here's what we learned about how AI is influencing local and national journalism:

55 29 5 2

9 months ago

🤔 What if you gave an LLM thousands of random human-written paragraphs and told it to write something new -- while copying 90% of its output from those texts?

🧟 You get what we call a Frankentext!

💡 Frankentexts are surprisingly coherent and tough for AI detectors to flag.

34 8 1 1

11 months ago

International students will stop coming to American universities if their visas are going to be at risk. This will make our intellectual community poorer and also make tuition more expensive for domestic students.

595 165 7 16

11 months ago

There is a quasi-religion in Silicon Valley that views AI as godlike. This faith has always been parallel to Evangelical Christianity: salvation (transhumanism), the rapture (the technological singularity), and demons (Roko's Basilisk)

Lately the AI faith has fully fused with Christian Nationalism.

5,984 1,421 101 257

1 year ago

Introducing 🐻 BEARCUBS 🐻, a “small but mighty” dataset of 111 QA pairs designed to assess computer-using web agents in multimodal interactions on the live web!
✅ Humans achieve 85% accuracy
❌ OpenAI Operator: 24%
❌ Anthropic Computer Use: 14%
❌ Convergence AI Proxy: 13%

11 5 1 3

1 year ago

Is the needle-in-a-haystack test still meaningful given the giant green heatmaps in modern LLM papers?

We create ONERULER 💍, a multilingual long-context benchmark that allows for nonexistent needles. Turns out NIAH isn't so easy after all!

Our analysis across 26 languages 🧵👇

14 5 1 3

1 year ago

⚠️ Current methods for generating instruction-following data fall short for long-range reasoning tasks like narrative claim verification.

We present CLIPPER ✂️, a compression-based pipeline that produces grounded instructions for ~$0.5 each, 34x cheaper than human annotations.

21 8 1 2

1 year ago

Also, the non-experts vary in how much they use LLMs. Having a writing background is key, a fact many are missing.

1 0 0 0

1 year ago

Hi Shane. We originally used 5 people, only 1 of whom could detect AI-generated text. I then searched out people who I thought could be experts and they had to pass multiple rounds of testing to be included in the study. Details in appendix. Nonexpert performance is already widely known.

0 0 0 0

1 year ago

This is a great question - we didn’t dive deeper than choosing articles from American publications. There were a few cases where experts noted awkward phrasing and thought the author could be a non-native speaker, but still knew it was a human!

1 0 1 0

1 year ago

It would be very interesting to see if every language had its own set of “AI vocab” words 🤣

0 0 1 0

1 year ago

I think what matters most is users who do writing tasks like editing/publishing! It’s the mix of having great language skills and frequent usage. A lot of people who just use LLMs a lot are way worse detectors than they think they’ll be.

4 0 0 0

1 year ago

People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text
In this paper, we study how well humans can detect text generated by commercial LLMs (GPT-4o, Claude, o1). We hire annotators to read 300 non-fiction English articles, label them as either human-writt...

Link found in last post of thread 😀 (but putting it here again) arxiv.org/abs/2501.15654

2 0 1 0

1 year ago

GitHub - jenna-russell/human_detectors
Contribute to jenna-russell/human_detectors development by creating an account on GitHub.

📎 Paper: arxiv.org/abs/2501.15654
👩‍💻 Code & Data: github.com/jenna-russe...

Thanks to my amazing coauthors @markar.bsky.social and @miyyer.bsky.social and the support of UMass NLP

9 0 2 0

1 year ago

We're releasing our dataset of articles and expert annotations! 📂✨
We hope this helps users of automatic detectors understand not just if a text is AI-generated, but why. 🤖📖

3 0 1 0

1 year ago

Can LLMs mimic human expert detectors? 🤔

We prompted LLMs to imitate our expert annotators. The results show promise, outperforming detectors like Binoculars and RADAR. 🚀 However, LLMs still fall short of matching our human experts and advanced detectors like Pangram. ⚖️👥

2 0 1 0

1 year ago

What they get wrong: ❌

Sometimes, humans get tripped up by:
📚 Common "AI vocab" words in human-written texts
✍️ Grammar mistakes they assume "AI wouldn’t make"
🌀🗣️ One expert was often fooled by o1's use of informal language - like slang, contractions, and colloquialisms.

7 2 1 0

1 year ago

What experts get right: ✅

They spot telltale signs of AI, like:
📚 "AI Vocab" (delve, crucial, vibrant ...)
🔄 Predictable sentence structure
🗨️ Quotes that feel too polished

For human-written content, they look for:
🎨 Creativity
🎭 Stylistic quirks
🌊 A natural & clear flow

13 3 1 0

1 year ago

Across GPT-4o, Claude, and o1 articles, experts correctly identified 99.3% of AI-generated content without misclassifying any human-written articles.🕵️‍♀️

Among automatic detectors, Pangram significantly outperformed the rest, missing only a few more texts than the experts. 🔍⚡

10 0 1 0