Table titled “Taxonomy for evaluation of AI in mental health applications,” organized into columns for quality criteria (validity and reliability) and real-world use (implementation and maintenance). Rows distinguish support types: assessment, intervention, and information synthesis. Each cell lists detailed evaluation questions, such as construct and criterion validity, consistency across populations and time, feasibility, effectiveness, usability, acceptability, safety, and unintended consequences, providing a structured framework for assessing AI systems in mental health contexts.
🧩 Beyond Benchmarks: How to Evaluate Mental Health AI Responsibly
AI for mental health is a high-stakes area: its evaluation needs to meet the highest expectations.
The new preprint "Responsible Evaluation of AI for Mental Health", written by an interdisciplinary team spanning AI [...]
19.02.2026 09:46
Honored to give my first keynote at #IRCDL2026 on February 19th.
I’ll talk about how LLMs have shifted from productivity tools to everyday sources of info & personal guidance and what that means for risk, trust, bias, and alignment.
ircdl2026.unimore.it
17.02.2026 10:22
The image displays the words "Politics & Gender" in yellow text on a green background, with the hashtag "#OpenAccess" below it. A vertical yellow stripe is on the left side.
#OpenAccess from @politicsgenderj.bsky.social -
Male Agency? Analyzing Fatherhood Roles in Swedish Parliamentary Documents, 1993–2021 - https://cup.org/40el36q
- Lena Wängnerud, Elin Naurin, @dirkhovy.bsky.social, Lorenzo Lupo & Oscar Magnusson
#FirstView
17.02.2026 05:20
#MemoryModay #NLProc
A study by @gattanasio.cc et al. asks 'Is It Worth the (Environmental) Cost?', analyzing continuous training for language models. It weighs the benefits against the environmental impact to support responsible use. #AI #Sustainability
arxiv.org/pdf/2210.07365
26.01.2026 17:10
Tutorials and Resources – CSS @ IP-Paris
Website of the computational social science group at CREST-CNRS. Courses and tutorials for analyzing digital data in the social sciences.
What are the main issues discussed in a set of documents?
We’ve just released a step-by-step BERTopic tutorial.
We also launch a new page, gathering various NLP tutorials for social scientists.
www.css.cnrs.fr/tutorials-an...
27.01.2026 15:16
Citation is the foundation of academic promotion. It’s noisy, sure, but its integrity is worth fighting for. Hallucinated citations should be a desk reject.
22.01.2026 01:16
CSE 598-004 - Building Small Language Models
The second new class I'm teaching is a very experimental graduate-level seminar in CSE: "Building Small Language Models". I taught the grad-level NLP class last semester (so fun!) but students wanted more: which of these new ideas work, and which work for SLMs? jurgens.people.si.umich.edu/CSE598-004/
19.01.2026 21:29
MilaNLP 2025 Wrapped
Lots of learning, building, sharing, and growing together 🌱
#NLProc
20.01.2026 11:15
Found and added under data/
20.01.2026 11:21
I included some test cases on GitHub, will look if I still have the ones we used in the paper.
20.01.2026 11:11
⏳ Deadline approaching! We’re hiring 2 fully funded postdocs in #NLP.
Join the MilaNLP team and contribute to our upcoming research projects (SALMON & TOLD)
Details + how to apply: milanlproc.github.io/open_positio...
⏰ Deadline: Jan 31, 2026
19.01.2026 17:24
If you are curious about the theoretical background, see
Hovy, D., Berg-Kirkpatrick, T., Vaswani, A., & Hovy, E. (2013). Learning Whom to Trust With MACE. In: Proceedings of NAACL-HLT. ACL.
aclanthology.org/N13-1132.pdf
And for even more details:
aclanthology.org/Q18-1040.pdf
N/N
20.01.2026 10:20
I always wanted to revisit it, port it from Java to Python & extend to continuous data, but never found the time.
Last week, I played around with Cursor and got it all done in ~1 hour. 🤯
If you work with any response data that needs aggregation, give it a try, and let me know what you think!
4/N
20.01.2026 10:17
MACE estimates:
1. Annotator reliability (who’s consistent?)
2. Item difficulty (which examples spark disagreement?)
3. The most likely aggregate label (the latent “best guess”)
That “side project” ended up powering hundreds of annotation projects over the years.
3/N
20.01.2026 10:15
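To make the three quantities in this post concrete, here is a minimal Python sketch. It is not MACE's actual algorithm (MACE fits a probabilistic model; see the papers linked in the thread) but a much simpler, swapped-in heuristic: alternate between a reliability-weighted vote and re-scoring each annotator by agreement with the current consensus. The function names, the toy data, and the "confidence" score used as a stand-in for item difficulty are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def majority_vote(labels):
    """Return the most common label (ties go to the label counted first)."""
    return Counter(labels).most_common(1)[0][0]

def weighted_aggregate(annotations, n_iter=10):
    """Illustrative heuristic, NOT the MACE model.

    annotations: {item_id: {annotator_id: label}}
    Returns (consensus, reliability, confidence).
    """
    annotators = {a for votes in annotations.values() for a in votes}
    reliability = {a: 1.0 for a in annotators}  # start by trusting everyone equally
    consensus, confidence = {}, {}
    for _ in range(n_iter):
        # Step 1: reliability-weighted vote per item.
        for item, votes in annotations.items():
            scores = defaultdict(float)
            for annotator, label in votes.items():
                scores[label] += reliability[annotator]
            best = max(scores, key=scores.get)
            total = sum(scores.values())
            consensus[item] = best
            # Share of the total weight behind the winner; low values hint at a hard item.
            confidence[item] = scores[best] / total if total else 0.0
        # Step 2: reliability = fraction of an annotator's labels matching the consensus.
        for annotator in annotators:
            judged = [(item, votes[annotator])
                      for item, votes in annotations.items() if annotator in votes]
            agree = sum(label == consensus[item] for item, label in judged)
            reliability[annotator] = agree / len(judged)
    return consensus, reliability, confidence

# Hypothetical toy data: annotators "a" and "b" are careful, "c", "d", "e" are noisy.
annotations = {
    "i1": {"a": "pos", "b": "pos", "c": "pos", "d": "neg", "e": "neg"},
    "i2": {"a": "neg", "b": "neg", "c": "pos", "d": "neg", "e": "pos"},
    "i3": {"a": "pos", "b": "pos", "c": "neg", "d": "neg", "e": "pos"},
    "i4": {"a": "neg", "b": "neg", "c": "neg", "d": "pos", "e": "pos"},
    "i5": {"a": "pos", "b": "pos", "c": "neg", "d": "neg", "e": "neg"},
}

consensus, reliability, confidence = weighted_aggregate(annotations)
print("majority vote on i5:", majority_vote(list(annotations["i5"].values())))
print("weighted consensus on i5:", consensus["i5"])
print("reliability:", dict(sorted(reliability.items())))
```

On this toy data the plain majority labels item i5 "neg", while the weighted scheme flips it to "pos" once the two consistent annotators have earned more weight than the three noisy ones. For real annotation projects, the MACE tool itself is the better choice.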
However, disagreement isn’t just noise; it’s information. It can mean an item is genuinely hard, or that someone wasn’t paying attention. If only you knew whom to trust…
That summer, Taylor Berg-Kirkpatrick, Ashish Vaswani, and I built MACE (Multi-Annotator Competence Estimation).
2/N
20.01.2026 10:14
GitHub - dirkhovy/MACE: Multi-Annotator Competence Estimation tool
🚨 (Software) Update:
In my PhD, I had a side project to fix an annoying problem: when you ask 5 people to label the same thing, you often get different answers. But in ML (and lots of other analyses), you still need a single aggregated answer. Using the majority vote is easy, but often wrong.
1/N
20.01.2026 10:12
Postdoctoral Researcher – NLP (2 positions) | MilaNLP Lab @ Bocconi University
Two Postdoctoral Researcher positions – Deadline January 31st, 2026
New year, new job? If that is your current mantra, check the open postdoc positions with Debora Nozza and me at our lab. Deadline is January 31st.
milanlproc.github.io/open_positio...
19.01.2026 16:13
We’re opening 2 fully funded postdoc positions in #NLP!
Join the MilaNLP team and contribute to our upcoming research projects.
More details: milanlproc.github.io/open_positio...
⏰ Deadline: Jan 31, 2026
18.12.2025 15:29
Happy to have contributed to this
23.12.2025 13:55
Countering Hateful and Offensive Speech Online - Open Challenges
Flor Miriam Plaza-del-Arco, Debora Nozza, Marco Guerini, Jeffrey Sorensen, Marcos Zampieri. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts.…
#MemoryModay #NLProc "Countering Hateful and Offensive Speech Online - Open Challenges" by Plaza-Del-Arco, @debora_nozza, Guerini, Sorensen, Zampieri, 2024 is a tutorial on the challenges and solutions for detecting and mitigating hate speech.
22.12.2025 16:03
#MemoryModay #NLProc Uma, A. N. et al. examine AI model training in 'Learning from Disagreement: A Survey'. Disagreement-handling methods' performance is shaped by evaluation methods & dataset traits.
15.12.2025 16:02
Come work with @deboranozza.bsky.social, me, and the lab in Milan!
19.12.2025 10:58
We don't actually trust AI.
We trust the companies behind it.
As Maria Antoniak notes, every "private" chat flows through corporate systems with long histories of data misuse. If we care about AI ethics, we need to name power, not anthropomorphize models.
15.12.2025 17:04
Respectful or Toxic? Using Zero-Shot Learning with Language Models to Detect Hate Speech
Flor Miriam Plaza-del-arco, Debora Nozza, Dirk Hovy. The 7th Workshop on Online Abuse and Harms (WOAH). 2023.
#TBT #NLProc 'Respectful or Toxic?' by Plaza-del-Arco, @debora & @dirkhovy.bsky.social (2023) explores zero-shot learning for multilingual hate speech detection. Highlights prompt & model choice for accuracy. #AI #LanguageModels #HateSpeechDetection
11.12.2025 16:03
#MemoryModay #NLProc 'Leveraging Social Interactions to Detect Misinformation on Social Media' by Fornaciari et al. (2023) uses combined text and network analysis to spot unreliable threads.
08.12.2025 16:03
The Center for Information Technology Policy at Princeton invites applications for a Postdoctoral Fellow to work with Andy Guess (Politics/SPIA), Brandon Stewart (Sociology), and me (CS).
puwebp.princeton.edu/AcadHire/app...
Please apply before Sunday, the 13th of December!
09.12.2025 20:51
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models
Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, Bertie Vidgen. Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH). 2022.
#MemoryModay #NLProc 'Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models' by @paul-rottger.bsky.social et al. (2022). A suite of tests for 10 languages.
01.12.2025 16:03
political scientist • how people engage with info, and why this matters for attitudes and policy around the world • Deputy Editor @migrationjrnl.bsky.social • UK Young Academy • all the baking • www.wlallen.com
Assistant Professor in Political Science & Data Science at Trinity College Dublin, Director of the Applied Social Data Science (ASDS) Programme.
Former Post-Doc at QTM Emory, WUSTL PhD. UW-Madison alum & native.
www.jeffreyziegler.org
Yale SOM professor & Bulls fan. I study consumer finance, and econometrics is a big part of my research identity. He/him/his
he/they
/in/jakemannix, fka @pbrane
professionally: Tech Fellow, AI/Relevance, Walmart Global Tech
here: bad math/physics jokes, AI++, puns, OSS ML news, ultras/MTB/outdoorsy stuff, DL papers, shitposting, the fall of democracy
Abolish ICE, full stop.
Anti-cynic. Towards a weirder future. Reinforcement Learning, Autonomous Vehicles, transportation systems, the works. Asst. Prof at NYU
https://emerge-lab.github.io
https://www.admonymous.co/eugenevinitsky
Husband and dad. Senior Lecturer at Columbia University CS (NLP, AI, CS edu). Travel, music, foodie, sailing, politics, etc. Opinions are my own.
supporting researchers counting words in various ways with computers at university of arizona libraries; increasingly displaced new englander
Biomedical Informatics PhD • CITRIS Health @UC Berkeley • FAMIA • Focusing on Informatics and AI in medicine • Linfield U. Grad • Missoula MT
https://smcgrath.phd
Living and working with First Nations people who are keeping their ancestral languages strong. He/they.
Larrakia, Bininj, and Miriwoong country
http://linktr.ee/stevenbird
Associate Professor @ UBC
computational sociology
machine learning is feminist
You only have to look at the Medusa straight on to see her. And she’s not deadly. She’s beautiful and she’s laughing.
www.lauraknelson.com
Junior Faculty at the University of Mannheim || Computational Social Science ∩ Natural Language Processing || Formerly at: RWTH, GESIS || she/her
indiiigo.github.io/
Senior Researcher @gesis.org // Data Editor @polcommjournal.bsky.social
political communication (#polsky + #commsky) with text analysis and #rstats (#opendata + #openscience)
JohannesBGruber.eu
research software github.com/JBGruber
PhD student at the IMS (Uni Stuttgart)
PhD student @ University of Vienna | Role-playing LLMs, personalization & competing goal alignment | Cats, games & pop(culture|corn)
AI researcher Google DeepMind * hon. professor at Heriot-Watt University * mother of dragons * Own opinions only.
CSS Postdoc @ Northwestern University
NLP for Violence Research & Mental Health / Misinformation in Science
https://miriamschirmer.github.io
PhD student in social and political science
Bocconi University, Milano