Ximing Lu (@gximing) — Bluesky Profile

1 year ago

We found that frontier LLMs tend to be highly fluent and coherent, even though their linguistic diversity decreases after alignment.

1 year ago

Yes, we analyzed LLMs both before and after RLHF. We found that the CREATIVITY INDEX of LLMs decreases by an average of 30.1% after alignment, and this reduction is more significant at the verbatim level than the semantic level.

1 year ago

I'm not particularly familiar with this field, but here's a survey paper on jailbreak attacks and defenses against LLMs that might be relevant.

arxiv.org/pdf/2407.04295

1 year ago

We're curious: with LLMs having consumed vastly more text than any human could ever read—including the works of distinguished writers and historic figures—could they, by standing on the shoulders of giants, create novel text that reaches new heights of linguistic sophistication and creativity?

1 year ago

In our paper, we compare LLMs to professional human writers, ranging from world-renowned figures like Hemingway to less famous and newer-generation authors.

1 year ago

AI writing is improving, but it still can’t match human creativity
Computer program finds that ChatGPT and its ilk remix words well, yet their output remains derivative

Check out our latest work "AI as Humanity's Salieri," featured in ✨News from Science ✨. Dive into how we quantify linguistic creativity and explore: Are LLMs 🤖 as creative as humans 👩‍🎓?

Link: www.science.org/content/arti...

1 year ago

Join us to explore how we quantify linguistic creativity by reconstructing text from web snippets and investigate: Are LLMs 🤖 as creative as humans 👩‍🎓?

The stream will be recorded—catch it later if you can't join live! 🚀

1 year ago

Excited to talk about our latest work, "AI as Humanity's Salieri," at Fireside Chat today at 7 PM PST! 🔥

app.ploutos.dev/streams/inno...

1 year ago

See our latest work on (among other things) machine text detection through linguistic creativity measurement!

1 year ago

This corresponds to our observations (in a different setting) of vocabulary collapse when models are trained on their own outputs (basically all of RLHF)
bsky.app/profile/yoav...

Did you look at pre-post-training models?
(show some hyphen love ❤️)

1 year ago

Check out more details here: arxiv.org/pdf/2410.04265

1 year ago

Joint work with my amazing collaborators ✨: @melaniesclar.bsky.social, Skyler Hallinan, Niloofar Mireshghallah, Jiacheng Liu, Seungju Han, Allyson Ettinger, Liwei Jiang, Khyathi Chandu, @nouhadziri.bsky.social, Yejin Choi

1 year ago

Finally, the CREATIVITY INDEX proves to be a surprisingly effective criterion for zero-shot machine text detection, surpassing the strongest existing zero-shot system, DetectGPT, by 30.2%, and even outperforming the strongest supervised system, GhostBuster, in five out of six domains.

1 year ago

Furthermore, we explore creativity differences among various groups of humans. Despite in-group variance, famous authors of classic literature, like Hemingway and Dickens, exhibit the highest levels of creativity, consistent with their levels of renown.

1 year ago

Moreover, we found that RLHF dramatically reduces the CREATIVITY INDEX of LLMs, by an average of 30.1%. This reduction is more significant at the verbatim level than the semantic level, indicating that LLMs may have converged to a certain linguistic style preferred by humans during alignment.

1 year ago

We found that the CREATIVITY INDEX of human authors (specifically professional writers and historical figures) is on average 66.2% higher than that of LLMs. This gap is consistent across various domains (novel snippets, modern poems, and speech transcripts) at both verbatim and semantic levels.

1 year ago

To compute the CREATIVITY INDEX efficiently, we introduce DJ SEARCH, a novel dynamic programming algorithm that searches for verbatim and near-verbatim matches of text snippets (i.e., n-grams) from a given document against a vast reference corpus in linear runtime.
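
To give a feel for why such a search can run in linear time, here is a minimal two-pointer sketch. It is not DJ SEARCH itself: it handles verbatim matches only, and the names `matched_spans`, `make_toy_lookup`, and `occurs` are hypothetical stand-ins (a real system would query an index over the reference corpus). It relies on one property: if a span occurs in the corpus, every sub-span of it occurs too, so the end pointer never has to move backward.

```python
from typing import Callable, List, Sequence, Tuple


def make_toy_lookup(reference_docs: List[str]) -> Callable[[Sequence[str]], bool]:
    """Hypothetical stand-in for a corpus index: exact word-sequence lookup."""
    padded = [" " + " ".join(doc.lower().split()) + " " for doc in reference_docs]

    def occurs(span: Sequence[str]) -> bool:
        needle = " " + " ".join(span) + " "
        return any(needle in doc for doc in padded)

    return occurs


def matched_spans(
    tokens: Sequence[str],
    occurs: Callable[[Sequence[str]], bool],
    min_len: int,
) -> List[Tuple[int, int]]:
    """Return maximal spans [start, end) of length >= min_len found in the corpus."""
    spans: List[Tuple[int, int]] = []
    n = len(tokens)
    end = 0
    for start in range(n):
        # The end pointer only moves forward, so the loop makes at most
        # about 2 * n calls to `occurs` in total.
        end = max(end, start)
        while end < n and occurs(tokens[start:end + 1]):
            end += 1
        # Record the span only if it is long enough and not contained in
        # the previously recorded one.
        if end - start >= min_len and (not spans or end > spans[-1][1]):
            spans.append((start, end))
    return spans


occurs = make_toy_lookup(["the quick brown fox jumps over the lazy dog"])
tokens = "a quick brown fox sat over the lazy cat".split()
print(matched_spans(tokens, occurs, min_len=3))
```

The linear bound here is on the number of lookups; the cost of each lookup depends on how the reference corpus is indexed, which this sketch does not model.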

1 year ago

We define the L-uniqueness of a text as the proportion of its words that fall outside every n-gram (n ≥ L) appearing in a vast reference corpus (e.g., RedPajama).

The CREATIVITY INDEX is then defined as the area under the L-uniqueness curve across a range of minimum n-gram lengths L.
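
As a rough illustration of these two definitions, here is a toy sketch for the verbatim case. It makes several assumptions that are not from the thread: whitespace tokenization, a tiny in-memory reference corpus, a discrete sum over integer L as the "area under the curve", and an arbitrary range of L values. The function names are hypothetical.

```python
from typing import List, Set, Tuple


def build_ngram_index(reference_docs: List[str], min_n: int, max_n: int) -> Set[Tuple[str, ...]]:
    """Collect every n-gram (min_n <= n <= max_n) from the toy reference corpus."""
    index: Set[Tuple[str, ...]] = set()
    for doc in reference_docs:
        toks = doc.lower().split()
        for n in range(min_n, max_n + 1):
            for i in range(len(toks) - n + 1):
                index.add(tuple(toks[i:i + n]))
    return index


def l_uniqueness(tokens: List[str], index: Set[Tuple[str, ...]], L: int) -> float:
    """Fraction of words not covered by any matched n-gram with n >= L.

    For verbatim matching, checking n == L is enough: any longer matched
    n-gram is made of matched L-grams that cover the same words.
    """
    if not tokens:
        return 0.0
    covered = [False] * len(tokens)
    for i in range(len(tokens) - L + 1):
        if tuple(tokens[i:i + L]) in index:
            for j in range(i, i + L):
                covered[j] = True
    return covered.count(False) / len(tokens)


def creativity_index(text: str, reference_docs: List[str], min_L: int = 2, max_L: int = 4) -> float:
    """Discrete area under the L-uniqueness curve (assumed: sum over integer L)."""
    tokens = text.lower().split()
    index = build_ngram_index(reference_docs, min_L, max_L)
    return sum(l_uniqueness(tokens, index, L) for L in range(min_L, max_L + 1))


reference = ["the quick brown fox jumps over the lazy dog"]
print(creativity_index("the quick brown cat sleeps", reference))
```

In this toy case, text copied wholesale from the reference scores 0, and text sharing no n-grams with it scores the maximum (one unit per value of L).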

1 year ago

TLDR: We found that the seemingly remarkable creativity of LLMs 🤖 can be attributed in large part to the creativity of human-written texts on the web. In contrast, works by distinguished human authors 👩‍🎓 cannot be easily replicated by merely assembling snippets from other works.

1 year ago

Are LLMs 🤖 as creative as humans 👩‍🎓? Not quite!

Introducing CREATIVITY INDEX: a metric that quantifies the linguistic creativity of a text by reconstructing it from existing text snippets on the web. Spoiler: professional human writers like Hemingway are still far more creative than LLMs! 😲
