For today's reading group, Lasse Jantsch presented "Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers" by Adam Karvonen et al. (2026)
Paper: arxiv.org/abs/2512.15674
#NLProc
We were thrilled to host @mtutek.bsky.social at our lab last week.
His talk "From Internals to Integrity: How Insights into Transformer LMs Improve Safety, Interpretability, and Explanation Faithfulness" led to great discussions! 👏
#Transformers #AISafety #ExplainableAI #MLResearch #NLProc
Spending the week in Paris at #IASEAI 2026, joining colleagues for conversations on the present and future of safe and ethical AI.
24.02.2026 14:18 — 👍 5 🔁 1 💬 0 📌 0
For today's reading group @arimuti.bsky.social presented "SPeCtrum: A Grounded Framework for Multidimensional Identity Representation in LLM-Based Agent" by @keyeun.bsky.social.
#NLProc #identity
This year’s workshop places a strong emphasis on preventive rather than reactive approaches to women’s online safety. We invite submissions presenting new findings, recent work, new ideas or previously published research (non-archival) that fits the workshop theme.
19.02.2026 07:37 — 👍 2 🔁 3 💬 1 📌 0
Call for abstracts! (Deadline: 17 March)
Submissions are now open for the second edition of Towards a Safer Web for Women Workshop, taking place on 26 May at the Web Science Conference 2026 in Braunschweig 🇩🇪.
👉 More info & submission details: tsww26.github.io
Honored to give my first keynote at #IRCDL2026 on February 19th.
I’ll talk about how LLMs have shifted from productivity tools to everyday sources of info & personal guidance, and what that means for risk, trust, bias, and alignment.
ircdl2026.unimore.it
We were excited to host @naitian.org at today’s lab seminar for a talk on variation, semiotics, fashion, and style. A refreshing perspective at the intersection of sociolinguistics and NLP!
#NLProc
For today's reading group @elisabassignana.bsky.social presented "How AI Impacts Skill Formation" by Judy Hanwen Shen & Alex Tamkin (2026).
Paper: arxiv.org/pdf/2601.20245
#NLProc
Hello #NLProc #ACL2026NLP people. I am looking for **two emergency reviewers** in the Safety and Alignment in LLMs track for ACL/ARR.
Reviews are due Feb 15th. Please DM if interested and available.
Happy to offer drinks/food if you live in/pass by Lisbon ☀️
I'm looking for two emergency reviewers 🧑‍🚒👩‍🚒 for the ARR January Generalizability and Transfer track.
Please reach out if you have time and qualify to review, or repost for visibility 🙏🙏
Seems to be a common situation for ACs this round, but I'm also looking for two emergency reviewers for the January #ARR Evaluation and Resources track. I'd appreciate any help (reposts, encouragement, black magic...)
10.02.2026 11:15 — 👍 3 🔁 6 💬 0 📌 0
🧠For this week’s lab seminar, @boleima.bsky.social talked about how survey methodology can inform NLP research, from annotations to human–AI alignment.
#NLProc
For today's reading group, @deboranozza.bsky.social presented "LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users" by Elinor Poole-Dayan et al.
Paper: arxiv.org/pdf/2406.17737
#NLProc
Last week we welcomed Nikhil Sharma to our lab seminar for a talk on Information Seeking, Consumption and Dissemination with LLM-powered Information Systems.
#NLProc #HCI
For our weekly reading group, Afshin Karimi presented "What Social Media Use Do People Regret? An Analysis of 34K Smartphone Screenshots with Multimodal LLM". Super interesting discussion!
Paper: dl.acm.org/doi/10.1145/...
#NLProc
#TBT #NLProc
'SafetyKit: First Aid for Measuring Safety in Open-domain Conversational Systems' by Dinan et al. (2022) introduces a taxonomy of safety issues for conversational AI and assesses the limits of existing tools.
#AIsafety
#MemoryMonday #NLProc
The study 'Is It Worth the (Environmental) Cost?' by @gattanasio.cc et al. analyzes continuous training for language models, weighing its benefits against its environmental impact for responsible use. #AI #Sustainability
arxiv.org/pdf/2210.07365
Found and added under data/
20.01.2026 11:21 — 👍 5 🔁 2 💬 0 📌 0
I included some test cases on GitHub, will look if I still have the ones we used in the paper.
20.01.2026 11:11 — 👍 4 🔁 2 💬 0 📌 0
If you are curious about the theoretical background, see
Hovy, D., Berg-Kirkpatrick, T., Vaswani, A., & Hovy, E. (2013). Learning Whom to Trust with MACE. In Proceedings of NAACL-HLT. ACL.
aclanthology.org/N13-1132.pdf
And for even more details:
aclanthology.org/Q18-1040.pdf
N/N
I always wanted to revisit it, port it from Java to Python & extend to continuous data, but never found the time.
Last week, I played around with Cursor – and got it all done in ~1 hour. 🤯
If you work with any response data that needs aggregation, give it a try—and let me know what you think!
4/N
MACE estimates:
1. Annotator reliability (who’s consistent?)
2. Item difficulty (which examples spark disagreement?)
3. The most likely aggregate label (the latent “best guess”)
That “side project” ended up powering hundreds of annotation projects over the years.
3/N
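For the curious, here is a minimal Python sketch of the underlying idea: a Dawid-Skene-style EM loop that alternates between inferring per-item label posteriors and per-annotator reliabilities. The toy vote matrix and the function name are invented for illustration, and this is deliberately simpler than MACE itself, which additionally models explicit spamming behaviour per annotator; see the NAACL 2013 paper linked above for the full model.

```python
import numpy as np

# Toy example (made up): 5 annotators x 6 items, binary labels.
# Annotator 5 answers mostly at random; the others are fairly reliable.
votes = np.array([
    [1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],  # the inattentive annotator
])

# Majority vote, as a baseline for comparison.
print("majority vote:", (votes.sum(axis=0) > votes.shape[0] / 2).astype(int))

def em_aggregate(votes, n_labels=2, n_iter=50):
    """Alternate between (E) per-item label posteriors, weighting each
    vote by the annotator's current reliability, and (M) re-estimating
    each annotator's reliability as agreement with those posteriors."""
    n_annot, n_items = votes.shape
    reliability = np.full(n_annot, 0.8)  # optimistic initial guess
    posterior = None
    for _ in range(n_iter):
        # E-step: P(label | votes) for every item, treating each vote as
        # correct with probability `reliability[a]`, else uniform noise.
        posterior = np.ones((n_items, n_labels))
        for a in range(n_annot):
            for i in range(n_items):
                for lab in range(n_labels):
                    posterior[i, lab] *= (
                        reliability[a] if votes[a, i] == lab
                        else (1 - reliability[a]) / (n_labels - 1)
                    )
        posterior /= posterior.sum(axis=1, keepdims=True)
        # M-step: an annotator's reliability is how often their votes
        # agree with the (soft) consensus.
        reliability = np.array([
            np.mean([posterior[i, votes[a, i]] for i in range(n_items)])
            for a in range(n_annot)
        ])
    return posterior.argmax(axis=1), reliability

labels, rel = em_aggregate(votes)
print("EM labels:     ", labels)
print("reliabilities: ", rel.round(2))
```

On this toy matrix the inattentive annotator's estimated reliability drops well below the others', so their votes stop pulling the aggregate labels around, which is exactly the "whom to trust" estimate the thread describes.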
However, disagreement isn’t just noise—it’s information. It can mean an item is genuinely hard—or someone wasn’t paying attention. If only you knew whom to trust…
That summer, Taylor Berg-Kirkpatrick, Ashish Vaswani, and I built MACE (Multi-Annotator Competence Estimation).
2/N
🚨(Software) Update:
In my PhD, I had a side project to fix an annoying problem: when you ask 5 people to label the same thing, you often get different answers. But in ML (and lots of other analyses), you still need a single aggregated answer. Majority vote is easy, but often wrong.
1/N
The deadline is approaching! Join the team :)
20.01.2026 10:27 — 👍 5 🔁 2 💬 0 📌 0
This week at reading group 📚
@pranav-nlp.bsky.social presented "You Cannot Sound Like GPT": Signs of language discrimination and resistance in computer science publishing.
Paper: arxiv.org/abs/2505.08127
#NLProc
Thank you @belindazli.bsky.social for the great talk “Solving the Specification Problem through Interaction” at our weekly seminar!
#NLProc
⏳ Deadline approaching! We’re hiring 2 fully funded postdocs in #NLP.
Join the MilaNLP team and contribute to our upcoming research projects (SALMON & TOLD)
🔗 Details + how to apply: milanlproc.github.io/open_positio...
⏰ Deadline: Jan 31, 2026
🎉 MilaNLP 2025 Wrapped 🎉
Lots of learning, building, sharing, and growing together 🌱
#NLProc