Indira Sen

@indiiigo.bsky.social

Junior Faculty at the University of Mannheim || Computational Social Science ∩ Natural Language Processing || Formerly at: RWTH, GESIS || she/her indiiigo.github.io/

582 Followers  |  412 Following  |  37 Posts  |  Joined: 23.09.2023

Latest posts by indiiigo.bsky.social on Bluesky

Lots of great posters at the #wikinlp workshop at #ACL2025NLP

01.08.2025 13:54 — 👍 0    🔁 0    💬 0    📌 0

Great keynote by Matthias Gallé on multilinguality in LLMs with takeaways on how we have to go broader *and* deeper to achieve multilingual efficacy by covering local knowledge.

Struck by the industrialization of LLM research with LLM tech reports now having massive # authors. #wikinlp #acl2025nlp

01.08.2025 09:47 — 👍 0    🔁 0    💬 1    📌 0

Time for our second keynote 🚨

@fvancesco.bsky.social is going to guide us through practical aspects of safety that are often overlooked in academia.

Do we want to close the gap between academia and industry? Join us to find out!

#ACL2025NLP

01.08.2025 09:08 — 👍 5    🔁 1    💬 0    📌 1

Excellent panel on dataset papers using Wikipedia data and the importance and challenges of multilingual research.

Check out the dataset papers here: meta.m.wikimedia.org/wiki/NLP_for...

01.08.2025 09:15 — 👍 0    🔁 0    💬 1    📌 0

Incredible keynote by Monica Lam on creating LLM-powered research assistants.

One great example of NLP/Wikipedia synergy is this tool that helps find inconsistencies in Wikipedia articles and fix them semi-automatically: wikifix.genie.stanford.edu

01.08.2025 08:06 — 👍 2    🔁 1    💬 2    📌 0
WikiNLP opening session

On the interplay between Wikipedia and NLP

Happening now!

01.08.2025 07:07 — 👍 3    🔁 0    💬 1    📌 0
WikiNLP workshop program with keynotes, dataset panel, poster session, discussions with Wikipedia editors and more.

Last day of #ACL2025NLP but there's still lots to do: attend the #WikiNLP workshop, where we explore how NLP and Wikipedia can help each other!

We have amazing keynotes, discussions with Wikipedia editors, a panel + posters!

Details: meta.wikimedia.org/wiki/NLP_for...

Join us in room 2.31!

01.08.2025 05:43 — 👍 19    🔁 3    💬 1    📌 0
A Comparative Approach for Auditing Multilingual Phonetic Transcript Archives
Abstract. Curating datasets that span multiple languages is challenging. To make the collection more scalable, researchers often incorporate one or more imperfect classifiers in the process, like lang...

Presenting a TACL paper on strong limitations of universal speech recognition models and datasets, at #ACL2025, on *Wed. 11-12:30*. Pls come hear me out on how speech, as a hugely varying cultural practice, inherently resists the sort of large-scale datafication that's needed for machine learning

29.07.2025 14:32 — 👍 7    🔁 4    💬 0    📌 0

Hire Agostina! She does lots of great work in CSS+NLP

29.07.2025 11:21 — 👍 2    🔁 1    💬 0    📌 0

It’s poster board 1! The only CSS poster in this poster session!!

29.07.2025 09:23 — 👍 3    🔁 0    💬 0    📌 0

👋 #ACL2025NLP 🇦🇹 @marlutz.bsky.social and I are presenting our poster on demographic representativeness of LLMs today!

🕦 10:30-12:00
📍 Hall X5 (board 1 or 14 according to different sources 🧐)

Here’s the paper on ACL anthology: aclanthology.org/2025.finding...

Drop by!

29.07.2025 07:31 — 👍 19    🔁 7    💬 0    📌 1

Very excited about all these papers on sociotechnical alignment & the societal impacts of AI at #ACL2025.

As is now tradition, I made some timetables to help me find my way around. Sharing here in case others find them useful too :) 🧵

28.07.2025 06:12 — 👍 26    🔁 6    💬 1    📌 0

I'm at #ACL2025NLP this week in Vienna, and organizing a BoF session on Teaching NLP with Margot Mieskes! This is an informal session to bring together attendees interested in discussing current challenges and opportunities in teaching natural language processing.

27.07.2025 14:56 — 👍 1    🔁 1    💬 1    📌 0

The #ACL2025 #ACL2025NLP feed is up and running! It matches both hashtags and any posts from or mentions of @aclmeeting.bsky.social

Pin it to your home 📌 and enjoy!

bsky.app/profile/did:...

17.07.2025 11:15 — 👍 48    🔁 14    💬 2    📌 0

Made a tiny pit stop between conferences to teach at the summer school for women* in #polmeth [http://summerschoolwpm.org/]. And I'm glad, because I got to meet all the talented participants! Thanks to @sophiahunger.bsky.social and @gessler.bsky.social for organizing and for having me!

26.07.2025 12:19 — 👍 17    🔁 1    💬 0    📌 0
Classroom full of participants

Fantastic final full-day workshop on Friday with @indiiigo.bsky.social on LLMs & social science, with hands-on tutorials and lots of input on the theoretical & ethical aspects

26.07.2025 10:05 — 👍 6    🔁 1    💬 0    📌 1

_Estimands approach
_Macro-micro-macro model
_RCT, Bounded Rationality, dual-process models, game theory basics
_Philosophy of science & epistemology basics
_Causal inference vs descriptive inference
_Total Survey Error (TSE) & TED-on
_Read substantive literature whenever you do substantive research

24.07.2025 11:53 — 👍 8    🔁 1    💬 0    📌 0

No worries, see you at ACL!

25.07.2025 16:30 — 👍 1    🔁 0    💬 0    📌 0

Cool list! We have one at ACL Findings, led by @alpsha.bsky.social: arxiv.org/pdf/2411.08977 (poster session 12 on Wednesday)

We looked at LLM-human annotator alignment across multiple offensive language datasets to assess if demographic patterns held when LLMs aren’t steered with a persona.

24.07.2025 18:10 — 👍 1    🔁 0    💬 1    📌 0

Chair for Data Science in the Economic and Social Sciences at University of Mannheim having lots of fun at #ic2s2 @janajung.bsky.social @wanlo.bsky.social @indiiigo.bsky.social @jrupprec.bsky.social @maximiliankreutner.bsky.social and Stefano Balietti

23.07.2025 15:22 — 👍 21    🔁 4    💬 0    📌 0

What are your favorite recent papers on using LMs for annotation (especially in a loop with human annotators), synthetic data for task-specific prediction, active learning, and similar?

Looking for practical methods for settings where human annotations are costly.

A few examples in thread ↴

23.07.2025 08:10 — 👍 74    🔁 23    💬 14    📌 3

Before heading to ACL, I'm excited to be at #IC2S2 this week! 🌞

I'll present a related working paper on validating LLM social simulations at the ABM session on Tuesday (11 AM, Vingen 7): indiiigo.github.io/files/GABM_V...

(w/ @wanlo.bsky.social @mstrohm.bsky.social and @janalasser.bsky.social)

21.07.2025 20:00 — 👍 17    🔁 6    💬 0    📌 0
GitHub - Indiiigo/LLM_rep_review: Systematic Review of the Demographic Representativeness of LLMs

Joint work w/ @marlutz.bsky.social, Elisa Rogers, @dgarcia.eu and @mstrohm.bsky.social

You can find our code and annotated dataset of papers here: github.com/Indiiigo/LLM...

We annotated way more things, e.g., LLM used, response format, so please check it out!

5/5

21.07.2025 10:11 — 👍 2    🔁 2    💬 0    📌 0

Many papers that conclude positively about LLM representativeness don't report demographic subgroups or don't report LLM results for different demographic groups and subgroups.

The findings point to an inflated perception of LLM efficacy and call for better reporting of demographic data.

4/

21.07.2025 10:11 — 👍 0    🔁 0    💬 1    📌 0
Summary of findings.

The current literature is divided about LLM representativeness: 29% say yes, 31% say no. We find the usual issues that plague CSS/NLP literature: an outsized focus on the U.S. and marginalized groups excluded from the analysis.

We find these issues are worse in papers that say LLMs are representative.

3/

21.07.2025 10:11 — 👍 1    🔁 0    💬 1    📌 0
graph showing the distribution of most studied demographic categories.

We annotate different aspects of 211 papers including:
- usage context (e.g., content analysis, advice)
- steering method (prompting, RLHF)
- demographics studied
- whether the paper reported LLM results across demographic groups & subgroups
- the paper's conclusion on LLM representativeness

2/

21.07.2025 10:11 — 👍 1    🔁 0    💬 1    📌 0
Screenshot of our paper "Missing the Margins: A Systematic Literature Review on the Demographic Representativeness of LLMs"

Details about what we annotated in our systematic review

Do LLMs represent the people they're supposed to simulate or provide personalized assistance to?

We review the current literature in our #ACL2025 Findings paper and investigate what researchers conclude about the demographic representativeness of LLMs:
osf.io/preprints/so...

1/

21.07.2025 10:11 — 👍 23    🔁 8    💬 1    📌 2

Thinking about using #CSS methods to study #racism, #stereotypes or #hate speech in text? 📐

👉 Check out my first dissertation paper co-authored by @fabiennelind.bsky.social and @hajoboo.bsky.social just published in Annals of the ICA! @icahdq.bsky.social 🥳

🔗 doi.org/10.1093/annc...

21.07.2025 06:43 — 👍 85    🔁 36    💬 2    📌 1

Wanna do some authorship attribution? Chances are what tokenizer you use matters.

Tokenization is Sensitive to Language Variation, probably, more investigation necessary...

📄 ACL Findings paper: arxiv.org/pdf/2502.15343
🧑‍🏫 @dongng.bsky.social @davidjurgens.bsky.social and myself

See you at ACL!

17.07.2025 07:59 — 👍 14    🔁 4    💬 0    📌 1

Really excited to also present this work at #IC2S2 next week in Norrköping! 🎉 I'd love to discuss how to produce LLM survey responses at my poster on Wed at 13:30 (Poster Session 2, Poster ID 68) 📊

18.07.2025 15:19 — 👍 17    🔁 6    💬 0    📌 0
