OWL: Probing Cross-Lingual Recall of Memorized Texts via World Literature
Large language models (LLMs) are known to memorize and recall English text from their pretraining data. However, the extent to which this ability generalizes to non-English languages or transfers...
Check out our dataset and findings in more detail:
github.com/emirkaan5/OWL/
arxiv.org/abs/2505.22945
Work done at @UMassNLP by @alishasrivas.bsky.social, @emirdotexe.bsky.social, @nhminhle.bsky.social, @chautmpham.bsky.social, @markar.bsky.social, and @miyyer.bsky.social.
Does quantization (GPTQ) impact cross-lingual knowledge transfer?
LLaMA-3.1-70B: 4-bit > 8-bit → accuracy drops MORE at 8-bit (up to 25%)
LLaMA-3.1-8B: 8-bit > 4-bit → accuracy drops MORE at 4-bit (up to 8%)
Bigger models aren't always more robust to quantization. (A loading sketch follows below.)
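Not from the paper's code, but a minimal sketch of how this comparison could be set up, assuming the Hugging Face transformers GPTQ integration (optimum + auto-gptq); the model ID and calibration dataset are illustrative.

```python
# Hypothetical sketch: load a LLaMA-3.1 checkpoint quantized with GPTQ at a given
# bit width, then run the same probing prompts under each setting.
# Requires: transformers, optimum, auto-gptq.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative; swap in the 70B variant

def load_gptq_model(bits: int):
    """Quantize MODEL_ID to `bits` (4 or 8) with GPTQ, calibrated on C4."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    quant_config = GPTQConfig(bits=bits, dataset="c4", tokenizer=tokenizer)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, quantization_config=quant_config, device_map="auto"
    )
    return tokenizer, model

for bits in (4, 8):
    tokenizer, model = load_gptq_model(bits)
    # ... run direct probing / name cloze over the OWL passages and record accuracy ...
```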
LLMs can transfer knowledge across modalities (Text → Audio).
On GPT-4o, audio input vs. text input:
Direct Probing: 75.5% (vs. 92.3%)
Name Cloze: 15.9% (vs. 38.6%)
Prefix Probing: 27.2% (vs. 22.6%)
Qwen-Omni shows similar trends but lower accuracy. (An audio-probing sketch follows below.)
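A rough sketch of one way to run direct probing on an audio passage, assuming the OpenAI chat-completions audio-input format for GPT-4o's audio model; the file name, model name, and prompt wording are illustrative, not the paper's exact setup.

```python
# Hypothetical sketch: ask GPT-4o's audio model to identify a book from a spoken passage.
import base64
from openai import OpenAI

client = OpenAI()

# Encode a (hypothetical) WAV recording of an OWL passage as base64.
with open("owl_passage.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o-audio-preview",   # illustrative model name
    modalities=["text"],            # we only need a text answer back
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "You will hear a passage from a novel. Name the book and its author."},
            {"type": "input_audio",
             "input_audio": {"data": audio_b64, "format": "wav"}},
        ],
    }],
)
print(response.choices[0].message.content)
```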
What if we perturb the text?
- shuffled text
- masked character names
- passages without characters
These perturbations reduce accuracy, to degrees that vary across languages, BUT models can still identify the books far better than newly published books (0.1%). (A perturbation sketch follows below.)
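A minimal sketch of what two of these perturbations could look like; the example passage, character list, and mask token are assumptions, not taken from OWL.

```python
# Hypothetical sketch: shuffle word order and mask character names in a passage.
import random

def shuffle_words(passage: str, seed: int = 0) -> str:
    """Shuffle the words of a passage, destroying order while keeping vocabulary."""
    words = passage.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

def mask_character_names(passage: str, names: list[str], mask: str = "[MASK]") -> str:
    """Replace every occurrence of a known character name with a mask token."""
    for name in names:
        passage = passage.replace(name, mask)
    return passage

passage = "Mr. Darcy looked at Elizabeth and smiled."
print(shuffle_words(passage))
print(mask_character_names(passage, ["Darcy", "Elizabeth"]))
```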
LLMs can identify book titles and authors across languages, even languages not seen during pre-training:
63.8% accuracy on English passages
47.2% on official translations (Spanish, Turkish, Vietnamese)
36.5% on completely unseen languages like Sesotho & Maithili
OWL has aligned excerpts from 20 EN novels, with translations in ES, TR, and VI, plus 6 new low-resource languages, and EN audio.
We probe LLMs to:
1. identify the book/author (direct probing)
2. predict masked names (name cloze)
3. generate a continuation (prefix probing)
(Prompt-template sketches follow below.)
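An illustrative sketch of the three probing setups as prompt templates; the exact wording used in OWL may differ.

```python
# Hypothetical prompt templates for the three probing tasks.

def direct_probing_prompt(passage: str) -> str:
    """Ask the model to identify the source book and author of a passage."""
    return (
        "Here is a passage from a novel:\n\n"
        f"{passage}\n\n"
        "What is the title of the book, and who is its author?"
    )

def name_cloze_prompt(masked_passage: str) -> str:
    """Ask the model to recover a character name replaced by [MASK]."""
    return (
        "In the following passage, one character name has been replaced with [MASK]:\n\n"
        f"{masked_passage}\n\n"
        "Which name fills in [MASK]? Answer with the name only."
    )

def prefix_probing_prompt(prefix: str) -> str:
    """Give the model the start of a passage and ask it to continue verbatim."""
    return (
        "Continue the following passage from a novel, word for word:\n\n"
        f"{prefix}"
    )
```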
LLMs memorize novels in English. But what about existing translations? Or translations into new languages?
Our OWL dataset (31K examples across 10 languages) shows GPT-4o recognizes books:
92% English
83% official translations
69% unseen translations
75% as audio (EN)