📢 New paper accepted at @eaclmeeting.bsky.social
2026:
Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions
with
@mhedderich.bsky.social
@amodarressi.bsky.social
Hinrich Schütze
& Benjamin Roth.
Preprint: arxiv.org/abs/2512.12775
🧑‍🔬 I’m recruiting PhD students in Natural Language Processing in Computer Science at @unileipzig.bsky.social, together with @scadsai.bsky.social!
Topics include, but aren’t limited to:
🔎Linguistic Interpretability
🌍Multilingual Evaluation
📖Computational Typology
Please share!
#NLProc #NLP
CIS & MaiNLP Group picture at EMNLP 2025! 🤩 🤗 (1/3)
While I sadly 🥲 won't be at EMNLP this year myself, please do reach out to any of our members for a chat if you are interested in our research!
We also co-organize and participate in some great workshops at EMNLP:
Excited to be here in Suzhou for #EMNLP2025!
I’ll be presenting “ImpliRet”; check out our poster on Friday, Nov. 7 at 14:00.
If you’re into long-context, IR, or just want to chat, come *Pay Ali* a visit 😁
Link to thread:
x.com/zeinabtaghav...
Details on poster times and locations coming soon.
Would love to meet and chat ☕️💬
If you’re attending #ACL2025, feel free to stop by and say hi! 👋
🧵[4/4]
⏱️🔎 Time Course MechInterp
We track how factual knowledge forms in OLMo over training by analyzing the evolving roles of attention heads and FFNs.
Heads are dynamic and often repurposed; FFNs are stable and keep refining facts.
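The dynamic-vs-stable contrast above can be sketched with a toy stability metric: mean cosine similarity of a component's attribution vector across consecutive checkpoints. This is an illustrative assumption, not the paper's exact analysis; the simulated attribution arrays below are synthetic.

```python
import numpy as np

def stability(attr):
    """Mean cosine similarity between attribution vectors at consecutive
    checkpoints: high = the component keeps the same role over training.
    Toy metric for illustration; not the paper's exact measurement."""
    a = attr / np.linalg.norm(attr, axis=1, keepdims=True)
    return float(np.mean(np.sum(a[:-1] * a[1:], axis=1)))

rng = np.random.default_rng(1)
# Simulated attributions over 10 checkpoints (32-dim each):
# a "repurposed" head drifts to a fresh role at every checkpoint,
# while a "stable" FFN barely changes around a fixed role.
head = rng.normal(size=(10, 32))
ffn = rng.normal(size=(1, 32)) + 0.05 * rng.normal(size=(10, 32))

head_stab, ffn_stab = stability(head), stability(ffn)
```

Under this metric the simulated FFN scores near 1 while the repurposed head hovers near 0, mirroring the post's finding.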
By: A. Dawar Hakimi
arxiv.org/abs/2506.03434
🧵[3/4]
🌐 MEXA: Multilingual Evaluation of English-Centric LLMs
A method for assessing the multilingual capabilities of English-centric LLMs using parallel sentences. It estimates how many languages an LLM covers and at what level.
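The coverage idea can be sketched as a parallel-sentence alignment check: embed English sentences and their translations, and score a language by how often each English sentence's nearest neighbor is its true parallel. This is a toy sketch of the general idea with synthetic embeddings, not MEXA's exact scoring.

```python
import numpy as np

def alignment_score(eng_embs, tgt_embs):
    """Fraction of English sentences whose nearest neighbor among the
    target-language embeddings is the true parallel (row i matches row i).
    Toy MEXA-style alignment check; the paper's scoring may differ."""
    e = eng_embs / np.linalg.norm(eng_embs, axis=1, keepdims=True)
    t = tgt_embs / np.linalg.norm(tgt_embs, axis=1, keepdims=True)
    sims = e @ t.T  # (n, n) cosine similarities
    return float(np.mean(sims.argmax(axis=1) == np.arange(len(sims))))

# Synthetic demo: target embeddings are noisy copies of the English ones,
# mimicking a language the model represents well.
rng = np.random.default_rng(0)
eng = rng.normal(size=(50, 16))
tgt = eng + 0.1 * rng.normal(size=eng.shape)
score = alignment_score(eng, tgt)
```

A well-covered language yields a score near 1; a poorly covered one degrades toward chance (1/n).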
By: @kargaranamir.bsky.social
x.com/amir_nlp/sta...
🧵[2/4]
Leaving Vancouver after ICML’s closing fireworks 😁🎆
Heading to Toronto for a few days, then off to
@aclmeeting.bsky.social to present:
"Collapse of Dense Retrievers"
A work by @mohsen-fayyaz.bsky.social that I was fortunate to collaborate on.
Also co-presenting two other papers…🧵 [1/4]
Check out the paper & our GitHub repo (with results on recent models 🆕✨)!
📄: arxiv.org/abs/2502.05167
🔗: github.com/adobe-resear...
🤗: huggingface.co/datasets/amo...
This work was my internship project at
@adobe.com, in collaboration with my mentors there and Hinrich Schütze.
I’ll be at @icmlconf.bsky.social next week presenting NoLiMa!
Poster on Tue July 15, 4:30–7pm (E-2312).
Happy to grab a coffee and chat about long-context, memory, research, or just to catch up.
I’ll be in Toronto for a couple of days after the conference, let me know if you’re around!
MemLLM: Finetuning LLMs to Use Explicit Read-Write Memory
Ali Modarressi, Abdullatif Köksal, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze
Action editor: Greg Durrett
https://openreview.net/forum?id=dghM7sOudh
#memory #memorizing #memllm
The takeaway? We need robust retrievers that prioritize answer relevance, not just heuristic shortcuts.
Work with an amazing team:
@mohsen-fayyaz.bsky.social,
Hinrich Schütze,
@violetpeng.bsky.social
paper: arxiv.org/abs/2503.05037
dataset 🤗: t.co/QZFyCLqP0P
Cross-post from x.com/mohsen_fayyaz
We also analyze RAG: biased retrievers can mislead LLMs, degrading their performance by 34%, worse than retrieving nothing! 😮
When multiple biases combine, retrievers fail catastrophically:
📉 Answer-containing docs are ranked above a synthetic biased doc with no answer <3% of the time!
Dense retrievers are crucial for RAG and search, but do they actually retrieve useful evidence? 🤔
We design controlled experiments by repurposing a relation extraction dataset, exposing serious flaws in models like Dragon+ and Contriever.
📄 Collapse of Dense Retrievers
Accepted to #ACL2025 main conference 🎉🎉
In this paper we uncover major vulnerabilities in dense retrievers like Contriever, showing they favor:
📌 Shorter docs
📌 Early positions
📌 Repeated entities
📌 Literal matches
...all while ignoring the answer's presence!
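A controlled probe like the ones above pits two documents with the same evidence status against each other, isolating a single bias. The scorer below is a naive lexical-overlap stand-in (an assumption for illustration; the paper evaluates real dense retrievers such as Contriever and Dragon+), and it exhibits the literal-match failure: the distractor that parrots the query outranks the paraphrased answer doc.

```python
def overlap_score(query, doc):
    """Naive lexical-overlap scorer standing in for a dense retriever.
    (Illustrative assumption only; not a real retrieval model.)"""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q)

# One controlled probe: both docs are about the query, but only one
# contains the answer. The distractor repeats the query's literal
# wording; the answer doc paraphrases it.
query = "Who founded the Ford Motor Company?"
answer_doc = "Henry Ford established the automaker in 1903."
distractor = "Who founded the Ford Motor Company? This is unclear."

biased = overlap_score(query, distractor) > overlap_score(query, answer_doc)
```

The same template extends to the other biases: vary only document length, answer position, or entity repetition between the pair and check which doc the retriever ranks higher.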