Indira Sen @indiiigo - Bluesky Profile

Wikipedia Says AI Is Causing a Dangerous Decline in Human Visitors “With fewer visits to Wikipedia, fewer volunteers may grow and enrich the content, and fewer individual donors may support this work.”

Wikipedia is seeing a significant decline in human traffic because more people are getting the information that’s on Wikipedia via generative AI chatbots that were trained on its articles and search engines that summarize them, without actually clicking through to the site.

www.404media.co/wikipedia-sa...

17.10.2025 12:45 — 👍 848 🔁 360 💬 28 📌 66

One postdoctoral Research Position Deadline: November 15th, 2025

Join us as a postdoc at the Inequality Discourse Observatory at the University of Konstanz: stellen.uni-konstanz.de/jobposting/7...
We will do epic research between Linguistics and Computational Social Science at the Cluster of Politics of Inequality. Feel free to DM if you have any questions.

13.10.2025 15:06 — 👍 15 🔁 15 💬 0 📌 0

Come join next Wednesday if you want to rant about society's love-hate relationship with LLMs!

16.10.2025 09:32 — 👍 9 🔁 5 💬 0 📌 0

🚨 Are you looking for a PhD in #NLProc dealing with #LLMs?
🎉 Good news: I am hiring! 🎉
The position is part of the "Contested Climate Futures" project. 🌱🌍 You will focus on developing next-generation AI methods 🤖 to analyze climate-related concepts in content, including texts, images, and videos.

24.09.2025 07:34 — 👍 22 🔁 14 💬 1 📌 0

We are hiring multiple PhD students and postdocs for two newly funded projects at the intersection of mental health and political polarization at the CS Dept at Aalto, Finland. The PIs are Juhi Kulshrestha, Talayeh Aledavood, and Mikko Kivelä.

Full call text and link to apply: www.aalto.fi/en/open-posi...

17.09.2025 10:22 — 👍 8 🔁 5 💬 0 📌 1

We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation". We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks. For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations. Then, we collect 13 million LLM annotations across plausible LLM configurations. These annotations feed into 1.4 million regressions testing the hypotheses. For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions. Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking -- incorrect conclusions due to annotation errors. Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models. Since minor configuration changes can flip scientific conclusions, from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.

🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825

12.09.2025 10:33 — 👍 263 🔁 96 💬 6 📌 20
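
A minimal sketch of the mechanism described in the post above, under loud assumptions: every count below is invented for illustration and none comes from the paper, the two "configurations" are hypothetical error profiles rather than real models, and a pooled two-proportion z-test stands in for the regressions the authors actually run. It shows how an annotator whose errors differ between groups can turn a null result into a "significant" one.

```python
import math


def two_proportion_p_value(pos_a, n_a, pos_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (pos_a / n_a - pos_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def annotated_positives(true_pos, false_neg, false_pos):
    """Positives an annotator reports, given its error counts."""
    return true_pos - false_neg + false_pos


# Hypothetical ground truth: both groups have 300 positives out of 1000,
# so there is no true difference between group A and group B.
N = 1000
TRUE_POS = 300

# Hypothetical annotator configurations: (false negatives, false positives)
# per group. config_1 errs equally in both groups; config_2 produces extra
# false positives in group B only.
CONFIGS = {
    "config_1": {"A": (15, 35), "B": (15, 35)},
    "config_2": {"A": (15, 35), "B": (15, 84)},
}


def conclusions(alpha=0.05):
    """Map each labeling source to whether it finds a 'significant' effect."""
    results = {
        "ground_truth": two_proportion_p_value(TRUE_POS, N, TRUE_POS, N) < alpha
    }
    for name, errors in CONFIGS.items():
        pos_a = annotated_positives(TRUE_POS, *errors["A"])
        pos_b = annotated_positives(TRUE_POS, *errors["B"])
        results[name] = two_proportion_p_value(pos_a, N, pos_b, N) < alpha
    return results


if __name__ == "__main__":
    for source, significant in conclusions().items():
        print(f"{source}: significant={significant}")
```

In this toy setup the ground truth and config_1 agree that there is no effect, while config_2 reports one, purely because its annotation errors are correlated with the group variable.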

How can an imitative model like an LLM outperform the experts it is trained on? Our new COLM paper outlines three types of transcendence and shows that each one relies on a different aspect of data diversity. arxiv.org/abs/2508.17669

29.08.2025 21:45 — 👍 95 🔁 17 💬 3 📌 5

Come join and organise the workshop with us!

25.08.2025 13:17 — 👍 2 🔁 1 💬 0 📌 0

Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision–Language Models Minh Duc Bui, Katharina Von Der Wense, Anne Lauscher. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technol...

maybe @a-lauscher.bsky.social's "Multi3hate: Multimodal, multilingual, and multicultural hate speech detection with vision-language models" aclanthology.org/2025.naacl-l...

20.08.2025 21:59 — 👍 2 🔁 0 💬 0 📌 0

ArgMining 2026 Workshop Organising Committee Application

If you want to nominate yourself to be the organizer of the next Argument Mining workshop @argminingorg.bsky.social, fill in this form: docs.google.com/forms/d/e/1F... Deadline: Aug 22nd 13.00 CEST!

20.08.2025 12:14 — 👍 1 🔁 1 💬 0 📌 0

Moral Foundation Measurements Fail to Converge on Multilingual Party Manifestos | Political Analysis | Cambridge Core

New publication, out in Political Analysis:

There is an increasing array of tools to measure facets of morality in political language. But while they ostensibly measure the same concept, do they actually?

@fhopp.bsky.social and I set out to see what happens.

19.08.2025 07:52 — 👍 34 🔁 14 💬 3 📌 1

Call for Volunteers Official website for the 2025 Conference on Empirical Methods in Natural Language Processing

The Call for #EMNLP2025 @emnlpmeeting.bsky.social student volunteers is out:
2025.emnlp.org/calls/volunt...
Please fill out the form by 20 Sep 2025: forms.gle/qfTkVGyDitXi...
For questions, you can contact emnlp2025-student-volunteer-chairs [at] googlegroups [dot] com

15.08.2025 16:40 — 👍 3 🔁 4 💬 0 📌 0

Work with @giadapistilli.com and @yjernite.bsky.social

📄 Full Paper: huggingface.co/datasets/AI-...
🔍 Explore INTIMA: huggingface.co/datasets/AI-...

11.08.2025 08:12 — 👍 4 🔁 2 💬 0 📌 0

Identity-Aware AI Workshop announcement. Co-located with ECAI 2025 in Bologna on October 25, with submission deadline August 22. Topics include: Methods for effective, fair, and inclusive AI; Critiques of AI on the exclusion of identities; Methods for detecting and controlling bias; Perspectivist approaches to AI. Submission types: Long papers (8 pages), Short papers (4 pages), Extended abstracts, Mixed-media submissions (videos, blogs, codebase, artworks). For details, visit: identity-aware-ai.github.io

Wondering what makes each of us unique and how AI should handle human diversity? 🤔

We're organizing the Identity-Aware AI workshop at #ECAI2025 in Bologna on Oct 25.

Deadline: Aug 22
Website: identity-aware-ai.github.io

29.07.2025 13:00 — 👍 5 🔁 3 💬 1 📌 1

Volunteers fight to keep ‘AI slop’ off Wikipedia Hundreds of Wikipedia articles may contain AI-generated errors. Editors are working around the clock to stamp them out.

Wikipedia has long been one of my favourite places online. As AI becomes part of knowledge creation, there's a lot we can learn from its editor communities. I spoke with Daniel Wu about AI content on Wikipedia; some thoughts made it into this piece:
www.washingtonpost.com/technology/2...

08.08.2025 15:17 — 👍 8 🔁 3 💬 0 📌 0

Exploring Public Attitudes Toward Generative AI for News Across Four Countries | Journal of Quantitative Description: Digital Media

What do people in 🇨🇭🇩🇪🇯🇵🇺🇸 think about GenAI for news-related purposes?

We find the adoption of GenAI for news and trust in the journalistic deployment of GenAI are relatively low, and so is knowledge regarding GenAI.

Read more in this new paper led by Eliza Mitova! journalqd.org/article/view...

07.08.2025 07:16 — 👍 8 🔁 3 💬 1 📌 0

Can We Fix Social Media? Testing Prosocial Interventions using Generative Social Simulation Social media platforms have been widely linked to societal harms, including rising polarization and the erosion of constructive debate. Can these problems be mitigated through prosocial interventions?...

We built the simplest possible social media platform. No algorithms. No ads. Just LLM agents posting and following.

It still became a polarization machine.

Then we tried six interventions to fix social media.

The results were… not what we expected.

arxiv.org/abs/2508.03385

06.08.2025 08:24 — 👍 287 🔁 99 💬 13 📌 42

Annie and Lakeesha struggle in school. AI teacher assistants treated them very differently. A Common Sense Media study found that prominent teacher assistants that use AI generated recommendations that appeared to be rooted in racial stereotypes based on students’ names. About a third of tea...

"Asked to generate intervention plans for struggling students, AI teacher assistants recommended more-punitive measures for hypothetical students with Black-coded names and more supportive approaches for students the platforms perceived as white" www.chalkbeat.org/2025/08/06/a...

06.08.2025 19:25 — 👍 547 🔁 315 💬 9 📌 68

Lots of great posters at the #wikinlp workshop at #ACL2025NLP

01.08.2025 13:54 — 👍 0 🔁 0 💬 0 📌 0

Great keynote by Matthias Gallé on multilinguality in LLMs with takeaways on how we have to go broader *and* deeper to achieve multilingual efficacy by covering local knowledge.

Struck by the industrialization of LLM research with LLM tech reports now having massive # authors. #wikinlp #acl2025nlp

01.08.2025 09:47 — 👍 0 🔁 0 💬 1 📌 0

Time for our second keynote 🚨

@fvancesco.bsky.social is going to guide us through practical aspects of safety that are often overlooked in academia.

Do we want to close the gap between academia and industry? Join us to find out!

#ACL2025NLP

01.08.2025 09:08 — 👍 5 🔁 1 💬 0 📌 1

Excellent panel on dataset papers using Wikipedia data and the importance and challenges of multilingual research.

Check out the dataset papers here: meta.m.wikimedia.org/wiki/NLP_for...

01.08.2025 09:15 — 👍 0 🔁 0 💬 1 📌 0

Incredible keynote by Monica Lam on creating LLM-powered research assistants.

One great example of NLP/Wikipedia synergy is this tool that helps find inconsistencies in Wikipedia articles and fix them semi-automatically: wikifix.genie.stanford.edu

01.08.2025 08:06 — 👍 2 🔁 1 💬 2 📌 0

WikiNLP opening session

On the interplay between Wikipedia and NLP

Happening now!

01.08.2025 07:07 — 👍 3 🔁 0 💬 1 📌 0

WikiNLP workshop program with keynotes, dataset panel, poster session, discussions with Wikipedia editors and more.

Last day of #ACL2025NLP but there's still lots to do: attend the #WikiNLP workshop, where we explore how NLP and Wikipedia can help each other!

We have amazing keynotes, discussions with Wikipedia editors, a panel + posters!

Details: meta.wikimedia.org/wiki/NLP_for...

Join us in room 2.31!

01.08.2025 05:43 — 👍 20 🔁 3 💬 1 📌 0

Hire Agostina! She does lots of great work in CSS+NLP

29.07.2025 11:21 — 👍 2 🔁 1 💬 0 📌 0

It’s poster board 1! The only CSS poster in this poster session!!

29.07.2025 09:23 — 👍 3 🔁 0 💬 0 📌 0

👋 #ACL2025NLP 🇦🇹 @marlutz.bsky.social and I are presenting our poster on demographic representativeness of LLMs today!

🕦 10:30-12:00
📍 Hall X5 (board 1 or 14 according to different sources 🧐)

Here’s the paper on ACL anthology: aclanthology.org/2025.finding...

Drop by!

29.07.2025 07:31 — 👍 22 🔁 7 💬 0 📌 1

Very excited about all these papers on sociotechnical alignment & the societal impacts of AI at #ACL2025.

As is now tradition, I made some timetables to help me find my way around. Sharing here in case others find them useful too :) 🧵

28.07.2025 06:12 — 👍 26 🔁 6 💬 1 📌 0
