@chantalsh.bsky.social
PhD (in progress) @ Northeastern! NLP, LLMs, she/her
(7/7) For more details, please check out our pre-print!
24.09.2025 13:21
(6/7) LLMs are terrible at detecting their own slop: GPT-5, DeepSeek-V3, and o3-mini rarely assign a label of "slop" (avg. 6% of documents), whereas humans marked 34% of texts as "slop."
24.09.2025 13:21
(5/7) We lack good/reliable automatic text metrics for 3 of the 5 most important slop features: relevance, coherence, and tone. :-(
24.09.2025 13:21
(4/7) Different domains have different slop signatures. In news articles, coherence, density, relevance, and tone issues predict slop. In Q&A tasks, it's factuality and structure. Context matters!
24.09.2025 13:21
(3/7) Humans can spot "sloppy text", but may have differing thresholds on overall assessments. Still, our annotators consistently flagged the same problematic passages, suggesting we know it when we see it...
24.09.2025 13:21
(2/7) TL;DR: Measuring the construct of slop is difficult! While somewhat subjective and domain-dependent, it boils down to three key factors: information quality, density, and stylistic choices. We introduce a taxonomy for slop.
24.09.2025 13:21
"AI slop" seems to be everywhere, but what exactly makes text feel like "slop"?
In our new work (w/ @tuhinchakr.bsky.social, Diego Garcia-Olano, @byron.bsky.social) we provide a systematic attempt at measuring AI "slop" in text!
arxiv.org/abs/2509.19163
🧵 (1/7)
I'm searching for some comp/ling experts to provide a precise definition of "slop" as it refers to text (see: corp.oup.com/word-of-the-...)
I put together a Google Form that should take no longer than 10 minutes to complete: forms.gle/oWxsCScW3dJU...
If you can help, I'd appreciate your input!
Can we trace a small distilled model back to its teacher? New work (w/ @chantalsh.bsky.social, @silvioamir.bsky.social & @byron.bsky.social) finds some footprints left by LLMs in distillation! [1/6]
Full paper: arxiv.org/abs/2502.06659