Michael A. Hedderich

@mhedderich.bsky.social

Research group leader at LMU Munich and MCML on ML, NLP & HCI. Also experimenting with lemonade that glows in the dark 🥤 (he/him)

96 Followers  |  78 Following  |  16 Posts  |  Joined: 09.12.2024  |  2.274

Latest posts by mhedderich.bsky.social on Bluesky

MCML has just reopened the call for its very competitive, but also really nice, fully funded PhD positions.

These positions are matched to research groups at both TUM and LMU, including my group and the other great ML and NLP groups here in Munich 😄

10.10.2025 11:14  |  👍 1  |  🔁 0  |  💬 0  |  📌 0
Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead With over 2,000 languages and potentially millions of speakers, Africa represents one of the richest linguistic regions in the world. Yet, this diversity is scarcely reflected in state-of-the-art natu...

Check out our survey at #EMNLP2025 and help build a future where low-resource languages, including African languages, are represented in NLP!

Paper: arxiv.org/abs/2505.21315

This work was led (in a great way) by Jesujoba Alabi, together with David Adelani and Dietrich Klakow.

02.10.2025 20:43  |  👍 0  |  🔁 0  |  💬 0  |  📌 0

Based on the analysis, we suggest future directions including:
1️⃣ Scale beyond the top-10 high-resource languages
2️⃣ Build more multicultural, native-language datasets
3️⃣ Develop African-centric LLMs
4️⃣ Focus on human-centered, application-driven NLP

02.10.2025 20:43  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

Key findings include:
1️⃣ Papers have increased rapidly in the last 5 years 📈
2️⃣ Research is skewed toward certain tasks like MT and NLU
3️⃣ Language coverage is uneven, with a few languages dominating

02.10.2025 20:43  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

We cover datasets, tasks, methods, and themes across 25+ venues (NLP, speech, HCI, ML), having manually analyzed 884 papers for this survey.

02.10.2025 20:43  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

We have 3 main goals:
1️⃣ Comprehensive Overview – Map the research landscape
2️⃣ Accessible Entry Point – Easy starting point for new researchers
3️⃣ Open Issues – Highlight gaps and challenges

02.10.2025 20:43  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

Despite resource gaps, NLP research on African languages is far from dormant. Growth is fueled by community initiatives, large multilingual corpora, shared tasks, and dedicated venues, making this a great time to chart the field.

02.10.2025 20:43  |  👍 0  |  🔁 0  |  💬 1  |  📌 0
NLP research distribution across Africa by language coverage.

Excited to share that our survey paper "Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead", led by Jesujoba Alabi, has been accepted at #EMNLP2025! Here’s a short 🧵 about the paper.

02.10.2025 20:43  |  👍 3  |  🔁 0  |  💬 1  |  📌 0

Headed to ACL? MaiNLP & our most recent work will be there too 👥📄
Come see what we’ve been working on!

23.07.2025 12:29  |  👍 14  |  🔁 5  |  💬 1  |  📌 2

Looking forward to my visit to Hamburg University and their Data Science group!

11.07.2025 11:06  |  👍 3  |  🔁 1  |  💬 0  |  📌 0

Joint work with Anyi Wang, @raoyuan.bsky.social, @florian-eichin.com, Jonas Fischer and @barbaraplank.bsky.social

Check out the paper at arxiv.org/abs/2504.158... or discuss the work with us at #ACL2025 in Vienna.

11.07.2025 10:43  |  👍 3  |  🔁 0  |  💬 0  |  📌 0

Through

📊 3 new benchmarks with ground truth,
📚 evaluation on existing prompt data,
🛠 demonstration studies, and
🙇 a user study,

we show how Spotlight can reliably provide new insights and support users in uncovering relevant differences in bias, cultural artifacts, language style, model failure, ...

11.07.2025 10:43  |  👍 3  |  🔁 0  |  💬 1  |  📌 0

uses data mining + human analysis to support users in better understanding the behavior of LLMs 🔎

We leverage token patterns to automatically distinguish between random (decoding) variations and systematic differences in LLM outputs and guide users in their nuanced analysis.

11.07.2025 10:43  |  👍 3  |  🔁 0  |  💬 1  |  📌 0

What changes if you take the LLM prompt “Tell me a short story about Dr. Li” and replace “Dr. Li” with “Dr. Smith”?

Would you have guessed that this introduces a massive gender bias, from ca. half/half to 99% male doctors?

In our #ACL2025 paper, we present the Spotlight framework, which...

11.07.2025 10:43  |  👍 9  |  🔁 2  |  💬 1  |  📌 0

uses data mining + human analysis to support users in better understanding the behavior of LLMs 🔎

We leverage token patterns to automatically distinguish between random (decoding) variations and systematic differences in LLM outputs and guide users in their nuanced analysis.

11.07.2025 10:37  |  👍 1  |  🔁 0  |  💬 0  |  📌 0

Interpretability meets Discourse. Congratulations to @florian-eichin.com on his first ACL paper 🎉

10.07.2025 13:19  |  👍 4  |  🔁 1  |  💬 0  |  📌 0

Want to know if your prompting is also affected by this? To address this and other issues systematically, we proposed Spotlight, which uses data mining to uncover the effects of prompt and model changes (meet us at ACL to discuss)
arxiv.org/abs/2504.15815

30.05.2025 14:57  |  👍 7  |  🔁 3  |  💬 0  |  📌 0

Are you attending NAACL 2025 and are you interested in low-resource languages and dialects?

Then don't miss our very own @verenablaschke.bsky.social's keynote talk at the WNUT 2025 workshop on May 3rd:

Beyond “noisy” text: How (and why) to process dialect data

🌐 ☀️
noisy-text.github.io/2025/

15.04.2025 21:49  |  👍 17  |  🔁 5  |  💬 1  |  📌 0

Happy to be part of that team for almost 1/3 of that time 😀

01.04.2025 13:02  |  👍 6  |  🔁 2  |  💬 0  |  📌 0
