Morality in AI is often oversimplified. @davidjurgens.bsky.social and @shivanikumar.bsky.social kick off the "Human-Centred NLP" orals at #ACL2025NLP with UniMoral, a huge dataset of moral scenario ratings in 6 languages! They find LLMs fail to simulate human moral decisions. bsky.app/profile/shiv...
30.07.2025 07:14
Work done at #UMSI with the amazing @davidjurgens.bsky.social! Read more in our preprint!
Paper: arxiv.org/abs/2502.14083
Dataset: huggingface.co/datasets/shi...
@umichresearch.bsky.social #umichresearch #umich
(n/n)
01.03.2025 00:56
Final verdict? Across languages & contexts, models struggle to exceed chance in moral reasoning, highlighting gaps, especially in data-scarce languages.
UniMoral supports studies on cross-cultural moral generalization, bias detection, & value quantification to enhance ethics in AI! (8/n)
01.03.2025 00:56
Are models better at psychological vs. real-world dilemmas?
Yes, models perform better on psychological scenarios than Reddit dilemmas.
The gap is larger in predicting ethics & decision factors.
Why? Structured scenarios align with values, while Reddit dilemmas add noise and ambiguity. (7/n)
01.03.2025 00:56
Do the responder's values improve predictions?
Yes, context matters!
Values aid action prediction, but models rely on surface patterns. Surprisingly, a short self-authored persona works as well as values in personalizing predictions. Examples also help in identifying decision factors. (6/n)
01.03.2025 00:56
Can models reason equally well in different languages?
No! Moral reasoning varies.
English, Spanish & Russian outperform. Arabic & Hindi show lower confidence due to limited data & complex morphology.
Identifying decision factors lags behind action prediction. (5/n)
01.03.2025 00:56
Can AI reason morally?
We tested LLMs with UniMoral to:
Make action choices
Identify ethical preferences
Recognize influences
Predict consequences
Insights: LLMs excel at action & consequence but lag in ethics & factors. But how well do they generalize across languages and contexts? (4/n)
01.03.2025 00:56
What's inside?
Multilingual hypothetical + Reddit-based dilemmas
Action choices of people across 46 countries!
Ethical principle preferences
Cultural & moral profiles of annotators
Consequence modeling
Think of it as a "CT scan" of human moral judgment! (3/n)
01.03.2025 00:56
Why care?
AI thrives on decision-making, yet most NLP research in moral reasoning relies on fragmented, Western-centric data. What's missing? A dataset capturing the full cycle: actions, ethics, consequences, and cultural nuance.
That's where UniMoral comes in. (2/n)
01.03.2025 00:56
Can AI grasp how humans across cultures reason through moral dilemmas?
Meet UniMoral, a unique multilingual dataset merging psychology & NLP to model moral reasoning as a pipeline. It enables LLMs to reason about decisions and their ethical implications across languages.
Thread (1/n)
01.03.2025 00:56
PhDing at LTI, CMU
Prev: Ai2, Google Research, MSR
Evaluating language technologies, regularly ranting, and probably procrastinating.
https://sites.google.com/view/shailybhatt/
Incoming CS PhD @UIUC | MSc @UMich | BEng @SJTU.
Interested in #NLProc & #AI.
leczhang.com
Assistant professor at https://si.umich.edu/ working in computational social science, machine learning, and NLP | https://dallascard.github.io
Associate professor at IT University of Copenhagen: NLP, language models, interpretability, AI & society. Co-editor-in-chief of ACL Rolling Review. #NLProc #NLP
Just a passionate dev, learning from this community daily.
Sharing the entire journey - bugs, breakthroughs, and banter.
PhD at Telecom SudParis, Institut Polytechnique de Paris.
I research content moderation methods in low-resource code mixed languages and literary linguistic phenomena in digital humanities (CSS).
Website: callmesanfornow.github.io
Assistant Professor at Bocconi University in MilaNLP group โข Working in #NLP, #HateSpeech and #Ethics โข She/her โข #ERCStG PERSONAE
Computer Science PhD student at Bielefeld University -
NLProc and Computational Social Science -
Disagreement, Human Label Variation, Perspectives -
Website: https://orlikow.ski
The Association for Computational Linguistics (ACL) is a scientific and professional organization for people working on Natural Language Processing/Computational Linguistics.
Hash tags: #NLProc #ACL2025NLP
Fueling NLP with Passion! Research in #NLP led by Dr. Shad Akhtar @ IIIT Delhi, India
Associate Professor at UMSI, UMICHCS, and UMICHCSE working on Computational Social Science, Network Science, Science of Science, Complex Systems, and Social Media. dromero.org
Computer Science -- Computation and Language
source: export.arxiv.org/rss/cs.CL
maintainer: @tmaehara.bsky.social
Chair Prof in AI, Associate Prof @iitdelhi; ACM Distinguished Speaker; Lab @lcs2lab; Previously @IIITDelhi @UofMaryland @iitkgp; #NLP #SocialComputing
EMNLP 2025 - The annual Conference on Empirical Methods in Natural Language Processing
Dates: November 5-9, 2025 in Suzhou, China
Hashtags: #EMNLP2025 #NLP
Submission Deadline: May 19th, 2025
Computational Linguists, Natural Language, Machine Learning
The AI community building the future!
Assistant Professor @Stanford CS @StanfordNLP @StanfordAILab
Computational Social Science & NLP
Associate prof at @UMich in SI and CSE working in computational social science and natural language processing. PI of the Blablablab blablablab.si.umich.edu