We found that frontier LLMs tend to be highly fluent and coherent, even though their linguistic diversity decreases after alignment.
Yes, we analyzed LLMs both before and after RLHF. We found that the CREATIVITY INDEX of LLMs decreases by an average of 30.1% after alignment, and this reduction is more significant at the verbatim level than the semantic level.
I'm not particularly familiar with this field, but here's a survey paper on jailbreak attacks and defenses against LLMs that might be relevant.
arxiv.org/pdf/2407.04295
We're curious: with LLMs having consumed vastly more text than any human could ever read—including the works of distinguished writers and historic figures—could they, by standing on the shoulders of giants, create novel text that reaches new heights of linguistic sophistication and creativity?
In our paper, we compare LLMs to professional human writers, ranging from world-renowned figures like Hemingway to less famous and newer-generation authors.
Check out our latest work "AI as Humanity's Salieri," featured in ✨News from Science ✨. Dive into how we quantify linguistic creativity and explore: Are LLMs 🤖 as creative as humans 👩🎓?
Link: www.science.org/content/arti...
Join us to explore how we quantify linguistic creativity by reconstructing text from web snippets and investigate: Are LLMs 🤖 as creative as humans 👩🎓?
The stream will be recorded—catch it later if you can't join live! 🚀
Excited to talk about our latest work, "AI as Humanity's Salieri," at Fireside Chat today at 7 PM PST! 🔥
app.ploutos.dev/streams/inno...
See our latest work on (among other things) machine text detection through linguistic creativity measurement!
This corresponds to our observations (in a different setting) of vocabulary collapse when models are trained on their own outputs (basically all of RLHF).
bsky.app/profile/yoav...
Did you look at pre-post-training models?
(show some hyphen love ❤️)
Joint work with my amazing collaborators ✨: @melaniesclar.bsky.social, Skyler Hallinan, Niloofar Mireshghallah, Jiacheng Liu, Seungju Han, Allyson Ettinger, Liwei Jiang, Khyathi Chandu, @nouhadziri.bsky.social, Yejin Choi
Finally, the CREATIVITY INDEX proves to be a surprisingly effective criterion for zero-shot machine text detection, surpassing the strongest existing zero-shot system, DetectGPT, by 30.2%, and even outperforming the strongest supervised system, GhostBuster, in five out of six domains.
Furthermore, we explore creativity differences among various groups of humans. Despite in-group variance, famous authors of classic literature, like Hemingway and Dickens, exhibit the highest levels of creativity, consistent with their levels of renown.
Moreover, we found that RLHF dramatically reduces the CREATIVITY INDEX of LLMs, by an average of 30.1%. This reduction is more significant at the verbatim level than the semantic level, indicating that LLMs may have converged to a certain linguistic style preferred by humans during alignment.
We found that the CREATIVITY INDEX of human authors (specifically professional writers and historical figures) is on average 66.2% higher than that of LLMs. This gap is consistent across various domains (novel snippets, modern poems, and speech transcripts) at both verbatim and semantic levels.
To compute the CREATIVITY INDEX efficiently, we introduce DJ SEARCH, a novel dynamic programming algorithm that searches for verbatim and near-verbatim matches of text snippets (i.e., n-grams) from a given document against a vast reference corpus in linear runtime.
We define L-uniqueness for a text as the proportion of its words that fall outside every n-gram (n ≥ L) that appears in a vast reference corpus (e.g., RedPajama).
The CREATIVITY INDEX is then defined as the area under the L-uniqueness curve across a range of minimum n-gram lengths L.
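For intuition, here is a minimal Python sketch of the two definitions on a toy scale. It is not DJ SEARCH and not our actual implementation: it assumes whitespace tokenization, exact (verbatim) matching only, a tiny in-memory list of strings standing in for the reference corpus, and a plain sum over integer L as the "area under the curve". The names `l_uniqueness`, `creativity_index`, `l_min`, and `l_max` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Set, Tuple


def ngrams(words: List[str], n: int) -> Iterable[Tuple[int, Tuple[str, ...]]]:
    """Yield (start_index, n-gram) pairs for a tokenized text."""
    for i in range(len(words) - n + 1):
        yield i, tuple(words[i:i + n])


def build_index(corpus: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Collect every n-gram of length n from a toy reference corpus."""
    index: Set[Tuple[str, ...]] = set()
    for doc in corpus:
        for _, gram in ngrams(doc.lower().split(), n):
            index.add(gram)
    return index


def l_uniqueness(text: str, corpus: List[str], L: int) -> float:
    """Fraction of words NOT covered by any corpus n-gram with n >= L.

    Any matching n-gram with n >= L is made of matching L-grams,
    so checking L-grams alone marks exactly the same words as covered.
    """
    words = text.lower().split()
    if not words:
        return 0.0  # assumption: an empty text is given uniqueness 0
    index = build_index(corpus, L)
    covered = [False] * len(words)
    for i, gram in ngrams(words, L):
        if gram in index:
            for j in range(i, i + L):
                covered[j] = True
    return covered.count(False) / len(words)


def creativity_index(text: str, corpus: List[str],
                     l_min: int = 2, l_max: int = 4) -> float:
    """Approximate the area under the L-uniqueness curve by a sum over L."""
    return sum(l_uniqueness(text, corpus, L) for L in range(l_min, l_max + 1))
```

With the toy corpus `["the quick brown fox jumps", "over the lazy dog"]`, the text "the quick brown cat sleeps" has 2-uniqueness 0.4: "the quick brown" can be rebuilt from corpus bigrams, while "cat" and "sleeps" cannot. A real setup would need an indexed web-scale corpus and near-verbatim (semantic) matching, both of which this sketch leaves out.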
TLDR: We found that the seemingly remarkable creativity of LLMs 🤖 can be attributed in large part to the creativity of human-written texts on the web. In contrast, works by distinguished human authors 👩🎓 cannot be easily replicated by merely assembling snippets from other works.
Are LLMs 🤖 as creative as humans 👩🎓? Not quite!
Introducing CREATIVITY INDEX: a metric that quantifies the linguistic creativity of a text by reconstructing it from existing text snippets on the web. Spoiler: professional human writers like Hemingway are still far more creative than LLMs! 😲