For today's reading group, Lasse Jantsch presented "Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers" by Adam Karvonen et al. (2026)
Paper: arxiv.org/abs/2512.15674
#NLProc
Spending the week in Paris at #IASEAI 2026, joining colleagues for conversations on the present and future of safe and ethical AI.
24.02.2026 14:18
We were thrilled to host @mtutek.bsky.social at our lab last week.
His talk "From Internals to Integrity: How Insights into Transformer LMs Improve Safety, Interpretability, and Explanation Faithfulness" led to great discussions!
#Transformers #AISafety #ExplainableAI #MLResearch #NLProc
Call for abstracts! (Deadline: 17 March)
Submissions are now open for the second edition of the Towards a Safer Web for Women Workshop, taking place on 26 May at the Web Science Conference 2026 in Braunschweig.
More info & submission details: tsww26.github.io
This year's workshop places a strong emphasis on preventive rather than reactive approaches to women's online safety. We invite submissions presenting new findings, recent work, new ideas or previously published research (non-archival) that fits the workshop theme.
19.02.2026 07:37
For today's reading group @arimuti.bsky.social presented "SPeCtrum: A Grounded Framework for Multidimensional Identity Representation in LLM-Based Agent" by @keyeun.bsky.social.
#NLProc #identity
Honored to give my first keynote at #IRCDL2026 on February 19th.
I'll talk about how LLMs have shifted from productivity tools to everyday sources of info & personal guidance and what that means for risk, trust, bias, and alignment.
ircdl2026.unimore.it
We were excited to host @naitian.org at today's lab seminar for a talk on variation, semiotics, fashion, and style. A refreshing perspective at the intersection of sociolinguistics and NLP!
#NLProc
For today's reading group @elisabassignana.bsky.social presented "How AI Impacts Skill Formation" by Judy Hanwen Shen & Alex Tamkin (2026).
Paper: arxiv.org/pdf/2601.20245
#NLProc
For this week's lab seminar, @boleima.bsky.social talked about how survey methodology can inform NLP research, from annotations to human-AI alignment.
#NLProc
For today's reading group, @deboranozza.bsky.social presented "LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users" by Elinor Poole-Dayan et al.
Paper: arxiv.org/pdf/2406.17737
#NLProc
Last week we welcomed Nikhil Sharma to our lab seminar for a talk on Information Seeking, Consumption and Dissemination with LLM-powered Information Systems.
#NLProc #HCI
#TBT #NLProc
'SAFETYKIT: Measuring Safety in Open-domain Conversational Systems' by Dinan et al. (2022) introduces taxonomy for AI safety, assesses tools' limits.
#AIsafety
For our weekly reading group, Afshin Karimi presented "What Social Media Use Do People Regret? An Analysis of 34K Smartphone Screenshots with Multimodal LLM". Super interesting discussion!
Paper: dl.acm.org/doi/10.1145/...
#NLProc
New year, new job? If that is your current mantra, check the open postdoc positions with Debora Nozza and me at our lab. Deadline is January 31st.
milanlproc.github.io/open_positio...
#MemoryMonday #NLProc
@gattanasio.cc et al. study asks 'Is It Worth the (Environmental) Cost?' analyzing continuous training for language models. Balances benefits, environmental impacts, for responsible use. #AI #Sustainability
arxiv.org/pdf/2210.07365
The deadline is approaching! Join the team :)
20.01.2026 10:27
(Software) Update:
In my PhD, I had a side project to fix an annoying problem: when you ask 5 people to label the same thing, you often get different answers. But in ML (and lots of other analyses), you still need a single aggregated answer. Using the majority vote is easy, but often wrong.
1/N
However, disagreement isn't just noise; it's information. It can mean an item is genuinely hard, or someone wasn't paying attention. If only you knew whom to trust…
That summer, Taylor Berg-Kirkpatrick, Ashish Vaswani, and I built MACE (Multi-Annotator Competence Estimation).
2/N
MACE estimates:
1. Annotator reliability (who's consistent?)
2. Item difficulty (which examples spark disagreement?)
3. The most likely aggregate label (the latent "best guess")
That "side project" ended up powering hundreds of annotation projects over the years.
3/N
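To make the three quantities above concrete, here is a minimal Python sketch. It is not MACE itself (the actual model is described in the papers cited in this thread); it is a much simpler stand-in that starts from the majority vote, scores each annotator by agreement with the current consensus, and then re-votes with those scores as weights. The toy data, the function names, and the disagreement measure are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: item -> {annotator: label}.
# Annotator "c" answers "x" on every item, regardless of content.
ANNOTATIONS = {
    "i1": {"a": "pos", "b": "pos", "c": "x"},
    "i2": {"a": "neg", "b": "neg", "c": "x"},
    "i3": {"a": "pos", "b": "pos", "c": "x"},
    "i4": {"a": "x", "b": "x", "c": "x"},
}


def majority_vote(annotations):
    """Pick the most frequent label per item (ties go to the first label seen)."""
    return {
        item: Counter(votes.values()).most_common(1)[0][0]
        for item, votes in annotations.items()
    }


def estimate(annotations, max_iter=10):
    """Agreement-weighted voting: a simplified stand-in, NOT the MACE model."""
    consensus = majority_vote(annotations)
    reliability = {}
    for _ in range(max_iter):
        # Reliability: share of an annotator's labels matching the consensus.
        agree, total = defaultdict(int), defaultdict(int)
        for item, votes in annotations.items():
            for annotator, label in votes.items():
                total[annotator] += 1
                agree[annotator] += int(label == consensus[item])
        reliability = {a: agree[a] / total[a] for a in total}

        # Re-vote, weighting each label by its annotator's reliability.
        updated = {}
        for item, votes in annotations.items():
            scores = defaultdict(float)
            for annotator, label in votes.items():
                scores[label] += reliability[annotator]
            updated[item] = max(scores, key=scores.get)

        if updated == consensus:  # converged
            break
        consensus = updated

    # Item difficulty proxy: share of annotators who differ from the consensus.
    disagreement = {
        item: 1 - sum(l == consensus[item] for l in votes.values()) / len(votes)
        for item, votes in annotations.items()
    }
    return consensus, reliability, disagreement


consensus, reliability, disagreement = estimate(ANNOTATIONS)
```

On this toy data the always-"x" annotator ends up with a much lower reliability score than the other two, which is the kind of signal a plain majority vote throws away.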
I always wanted to revisit it, port it from Java to Python & extend to continuous data, but never found the time.
Last week, I played around with Cursor and got it all done in ~1 hour.
If you work with any response data that needs aggregation, give it a try, and let me know what you think!
4/N
If you are curious about the theoretical background, see
Hovy, D., Berg-Kirkpatrick, T., Vaswani, A., & Hovy, E. (2013). Learning Whom to Trust With MACE. In: Proceedings of NAACL-HLT. ACL.
aclanthology.org/N13-1132.pdf
And for even more details:
aclanthology.org/Q18-1040.pdf
N/N
I included some test cases on GitHub, will look if I still have the ones we used in the paper.
20.01.2026 11:11
Found and added under data/
20.01.2026 11:21
We're opening 2 fully funded postdoc positions in #NLP!
Join the MilaNLP team and contribute to our upcoming research projects.
More details: milanlproc.github.io/open_positio...
Deadline: Jan 31, 2026
For today's reading group, Serena Pugliese presented the paper "Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models" by Piercosma Bisconti et al. (2025).
Paper: arxiv.org/pdf/2511.15304
#NLProc
#LLMs #jailbreaking
We're also back with the lab's seminar! Today we had Eleonora Mancini presenting her doctoral research "Multimodal AI for Human Expression Understanding".
#NLP #multimodality #speech
MilaNLP 2025 Wrapped
Lots of learning, building, sharing, and growing together
#NLProc
Deadline approaching! We're hiring 2 fully funded postdocs in #NLP.
Join the MilaNLP team and contribute to our upcoming research projects (SALMON & TOLD)
Details + how to apply: milanlproc.github.io/open_positio...
Deadline: Jan 31, 2026
Thank you @belindazli.bsky.social for the great talk "Solving the Specification Problem through Interaction" at our weekly seminar!
#NLProc