Shivani Kumar

@shivanikumar.bsky.social

Postdoc @ University of Michigan | PhD from LCS2, IIITDelhi | Working in Computational Social Science #NLProc More info: kumarshivani.com

22 Followers  |  25 Following  |  15 Posts  |  Joined: 20.02.2025

Latest posts by shivanikumar.bsky.social on Bluesky

Morality in AI is often oversimplified. @davidjurgens.bsky.social and @shivanikumar.bsky.social kick off the "Human-Centred NLP" orals #ACL2025NLP with UniMoral, a huge dataset of moral scenario ratings in 6 languages! They find LLMs fail to simulate human moral decisions. bsky.app/profile/shiv...

30.07.2025 07:14 | 👍 10    🔁 2    💬 1    📌 0

Work done at #UMSI with the amazing @davidjurgens.bsky.social! Read more in our preprint! 🔗
📄 Paper: arxiv.org/abs/2502.14083
📂 Dataset: huggingface.co/datasets/shi...

@umichresearch.bsky.social #umichresearch #umich
(n/n)

01.03.2025 00:56 | 👍 2    🔁 0    💬 0    📌 1

🏁 Final verdict? Across languages & contexts, models struggle to exceed chance in moral reasoning, highlighting gaps, especially in data-scarce languages.
UniMoral supports studies on cross-cultural moral generalization, bias detection, & value quantification to enhance ethics in AI! (8/n)

01.03.2025 00:56 | 👍 2    🔁 0    💬 1    📌 0

Are models better at psychological vs. real-world dilemmas?

👍 Yes, models perform better on psychological scenarios than Reddit dilemmas.
The gap is larger in predicting ethics & decision factors.
Why? Structured scenarios align with values, while Reddit dilemmas add noise and ambiguity. (7/n)

01.03.2025 00:56 | 👍 1    🔁 0    💬 1    📌 0

Do the responder's values improve predictions?

👍 Yes, context matters!
Values aid action prediction, but models rely on surface patterns. Surprisingly, a short self-authored persona works as well as values in personalizing predictions. Examples also help in identifying decision factors. (6/n)

01.03.2025 00:56 | 👍 1    🔁 0    💬 1    📌 0

Can models reason equally well in different languages?

👎 No! Moral reasoning varies.
English, Spanish & Russian outperform. Arabic & Hindi show lower confidence due to limited data & complex morphology.
➕ Identifying decision factors lags behind action prediction. (5/n)

01.03.2025 00:56 | 👍 1    🔁 0    💬 1    📌 0

Can AI reason morally?

We tested LLMs with UniMoral to:
⚖️ Make action choices
🏛️ Identify ethical preferences
✅ Recognize influences
🔮 Predict consequences
Insights: LLMs excel at action & consequence prediction but lag in identifying ethics & decision factors. But how well do they generalize across languages and contexts? (4/n)

01.03.2025 00:56 | 👍 2    🔁 0    💬 1    📌 0

What’s inside?

💭 Multilingual hypothetical + Reddit-based dilemmas
🌍 Action choices of people across 46 countries!
🔎 Ethical principle preferences
📊 Cultural & moral profiles of annotators
🔁 Consequence modeling
Think of it as a "CT scan" of human moral judgment! (3/n)

01.03.2025 00:56 | 👍 1    🔁 1    💬 1    📌 0

Why care? 🤔

AI thrives on decision-making, yet most NLP research in moral reasoning relies on fragmented, Western-centric data. What’s missing? A dataset capturing the full cycle: actions ⚖️, ethics 🏛️, consequences 🔄, and cultural nuance 🌍.
That’s where UniMoral comes in. (2/n)

01.03.2025 00:56 | 👍 1    🔁 0    💬 1    📌 0

Can AI grasp how humans across cultures reason through moral dilemmas?

✨ Meet UniMoral, a unique multilingual dataset merging psychology & NLP to model moral reasoning as a pipeline. It enables LLMs to reason about decisions and their ethical implications across languages.
Thread 🧵 (1/n)

01.03.2025 00:56 | 👍 2    🔁 0    💬 1    📌 1

@shivanikumar is following 20 prominent accounts