For today's reading group, Lasse Jantsch presented "Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers" by Adam Karvonen et al. (2026)
Paper: arxiv.org/abs/2512.15674
#NLProc
We were thrilled to host @mtutek.bsky.social at our lab last week.
His talk "From Internals to Integrity: How Insights into Transformer LMs Improve Safety, Interpretability, and Explanation Faithfulness" led to great discussions! 👏
#Transformers #AISafety #ExplainableAI #MLResearch #NLProc
Spending the week in Paris at #IASEAI 2026, joining colleagues for conversations on the present and future of safe and ethical AI.
24.02.2026 14:18 — 👍 5 🔁 1 💬 0 📌 0
For today's reading group @arimuti.bsky.social presented "SPeCtrum: A Grounded Framework for Multidimensional Identity Representation in LLM-Based Agent" by @keyeun.bsky.social.
#NLProc #identity
This year’s workshop places a strong emphasis on preventive rather than reactive approaches to women’s online safety. We invite submissions presenting new findings, recent work, new ideas or previously published research (non-archival) that fits the workshop theme.
19.02.2026 07:37 — 👍 2 🔁 3 💬 1 📌 0
Call for abstracts! (Deadline: 17 March)
Submissions are now open for the second edition of Towards a Safer Web for Women Workshop, taking place on 26 May at the Web Science Conference 2026 in Braunschweig 🇩🇪.
👉 More info & submission details: tsww26.github.io
Honored to give my first keynote at #IRCDL2026 on February 19th.
I’ll talk about how LLMs have shifted from productivity tools to everyday sources of info & personal guidance, and what that means for risk, trust, bias, and alignment.
ircdl2026.unimore.it
We were excited to host @naitian.org at today’s lab seminar for a talk on variation, semiotics, fashion, and style. A refreshing perspective at the intersection of sociolinguistics and NLP!
#NLProc
For today's reading group @elisabassignana.bsky.social presented "How AI Impacts Skill Formation" by Judy Hanwen Shen & Alex Tamkin (2026).
Paper: arxiv.org/pdf/2601.20245
#NLProc
Hello #NLProc #ACL2026NLP people. I am looking for **two emergency reviewers** in the Safety and Alignment in LLMs track for ACL/ARR.
Reviews are due Feb 15th. Please DM if interested and available.
Happy to offer drinks/food if you live in/pass by Lisbon ☀️
I'm looking for two emergency reviewers 🧑‍🚒👩‍🚒 for the ARR January Generalizability and Transfer track.
Please reach out if you have time and qualify to review, or repost for visibility 🙏🙏
Seems to be a common situation for ACs this round, but I'm also looking for two emergency reviewers for the January #ARR Evaluation and Resources track. I'd appreciate any help (reposts, encouragement, black magic...)
10.02.2026 11:15 — 👍 3 🔁 6 💬 0 📌 0
🧠For this week’s lab seminar, @boleima.bsky.social talked about how survey methodology can inform NLP research, from annotations to human–AI alignment.
#NLProc
For today's reading group, @deboranozza.bsky.social presented "LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users" by Elinor Poole-Dayan et al.
Paper: arxiv.org/pdf/2406.17737
#NLProc
Last week we welcomed Nikhil Sharma to our lab seminar for a talk on Information Seeking, Consumption and Dissemination with LLM-powered Information Systems.
#NLProc #HCI
For our weekly reading group, Afshin Karimi presented "What Social Media Use Do People Regret? An Analysis of 34K Smartphone Screenshots with Multimodal LLM". Super interesting discussion!
Paper: dl.acm.org/doi/10.1145/...
#NLProc
#TBT #NLProc
'SafetyKit: First Aid for Measuring Safety in Open-domain Conversational Systems' by Dinan et al. (2022) introduces a taxonomy of safety issues for conversational AI and assesses the limits of existing tools.
#AIsafety
#MemoryMonday #NLProc
The study 'Is It Worth the (Environmental) Cost?' by @gattanasio.cc et al. analyzes continuous training for language models, weighing its benefits against its environmental impact for responsible use. #AI #Sustainability
arxiv.org/pdf/2210.07365
Found and added under data/
20.01.2026 11:21 — 👍 5 🔁 2 💬 0 📌 0
I included some test cases on GitHub, will look if I still have the ones we used in the paper.
20.01.2026 11:11 — 👍 4 🔁 2 💬 0 📌 0
If you are curious about the theoretical background, see
Hovy, D., Berg-Kirkpatrick, T., Vaswani, A., & Hovy, E. (2013). Learning Whom to Trust with MACE. In Proceedings of NAACL-HLT. ACL.
aclanthology.org/N13-1132.pdf
And for even more details:
aclanthology.org/Q18-1040.pdf
N/N
I always wanted to revisit it, port it from Java to Python & extend to continuous data, but never found the time.
Last week, I played around with Cursor – and got it all done in ~1 hour. 🤯
If you work with any response data that needs aggregation, give it a try—and let me know what you think!
4/N
MACE estimates:
1. Annotator reliability (who’s consistent?)
2. Item difficulty (which examples spark disagreement?)
3. The most likely aggregate label (the latent “best guess”)
That “side project” ended up powering hundreds of annotation projects over the years.
3/N
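For the curious, here is a minimal Python sketch of the underlying idea: a Dawid-Skene-style EM loop that alternates between inferring per-item label posteriors and per-annotator reliabilities. The toy vote matrix and the function name are invented for illustration, and this is deliberately simpler than MACE itself, which additionally models explicit spamming behaviour per annotator; see the NAACL 2013 paper linked above for the full model.

```python
import numpy as np

# Toy example (made up): 5 annotators x 6 items, binary labels.
# Annotator 5 answers mostly at random; the others are fairly reliable.
votes = np.array([
    [1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],  # the inattentive annotator
])

# Majority vote, as a baseline for comparison.
print("majority vote:", (votes.sum(axis=0) > votes.shape[0] / 2).astype(int))

def em_aggregate(votes, n_labels=2, n_iter=50):
    """Alternate between (E) per-item label posteriors, weighting each
    vote by the annotator's current reliability, and (M) re-estimating
    each annotator's reliability as agreement with those posteriors."""
    n_annot, n_items = votes.shape
    reliability = np.full(n_annot, 0.8)  # optimistic initial guess
    posterior = None
    for _ in range(n_iter):
        # E-step: P(label | votes) for every item, treating each vote as
        # correct with probability `reliability[a]`, else uniform noise.
        posterior = np.ones((n_items, n_labels))
        for a in range(n_annot):
            for i in range(n_items):
                for lab in range(n_labels):
                    posterior[i, lab] *= (
                        reliability[a] if votes[a, i] == lab
                        else (1 - reliability[a]) / (n_labels - 1)
                    )
        posterior /= posterior.sum(axis=1, keepdims=True)
        # M-step: an annotator's reliability is how often their votes
        # agree with the (soft) consensus.
        reliability = np.array([
            np.mean([posterior[i, votes[a, i]] for i in range(n_items)])
            for a in range(n_annot)
        ])
    return posterior.argmax(axis=1), reliability

labels, rel = em_aggregate(votes)
print("EM labels:     ", labels)
print("reliabilities: ", rel.round(2))
```

On this toy matrix the inattentive annotator's estimated reliability drops well below the others', so their votes stop pulling the aggregate labels around, which is exactly the "whom to trust" estimate the thread describes.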
However, disagreement isn’t just noise—it’s information. It can mean an item is genuinely hard—or someone wasn’t paying attention. If only you knew whom to trust…
That summer, Taylor Berg-Kirkpatrick, Ashish Vaswani, and I built MACE (Multi-Annotator Competence Estimation).
2/N
🚨(Software) Update:
In my PhD, I had a side project to fix an annoying problem: when you ask 5 people to label the same thing, you often get different answers. But in ML (and lots of other analyses), you still need a single aggregated answer. Majority vote is easy, but often wrong.
1/N
The deadline is approaching! Join the team :)
20.01.2026 10:27 — 👍 5 🔁 2 💬 0 📌 0
This week at reading group 📚
@pranav-nlp.bsky.social presented "You Cannot Sound Like GPT": Signs of language discrimination and resistance in computer science publishing.
Paper: arxiv.org/abs/2505.08127
#NLProc
Thank you @belindazli.bsky.social for the great talk “Solving the Specification Problem through Interaction” at our weekly seminar!
#NLProc
⏳ Deadline approaching! We’re hiring 2 fully funded postdocs in #NLP.
Join the MilaNLP team and contribute to our upcoming research projects (SALMON & TOLD)
🔗 Details + how to apply: milanlproc.github.io/open_positio...
⏰ Deadline: Jan 31, 2026
🎉 MilaNLP 2025 Wrapped 🎉
Lots of learning, building, sharing, and growing together 🌱
#NLProc