We have updated the pre-print on CUS-QA, a benchmark for regional knowledge about Czechia, Slovakia and Ukraine: arxiv.org/abs/2507.22752
It now includes retrieval-augmented generation results and a more detailed analysis of how model performance depends on the topic of the question or the visual context.
03.02.2026 21:49 — 👍 6 🔁 1 💬 0 📌 0
👉 What do we do?
We use the good old IBM Model 1 to align subwords with morphological features from UniMorph, and we show that it captures the same thing as morpheme boundary recall.
👉 Why does it matter?
For many languages, good segmentation data is missing. Morphological features are more widely available.
02.02.2026 13:38 — 👍 2 🔁 1 💬 0 📌 0
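For readers who have not met IBM Model 1: here is a minimal sketch of the idea, not the code from the paper. It assumes a toy setup where each word comes with its subwords and a bag of feature labels. The word list, the `LEMMA=` labels, and the function names `ibm1` and `align` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (subwords of a word, morphological labels of that word).
pairs = [
    (["walk", "ed"], ["LEMMA=walk", "PST"]),
    (["walk", "s"], ["LEMMA=walk", "3SG"]),
    (["talk", "ed"], ["LEMMA=talk", "PST"]),
    (["talk", "s"], ["LEMMA=talk", "3SG"]),
]

def ibm1(pairs, iterations=10):
    """Estimate t(subword | feature) with EM, IBM Model 1 style (no NULL token)."""
    subwords = {s for subs, _ in pairs for s in subs}
    # Uniform initialization of the lexical probabilities.
    t = defaultdict(lambda: 1.0 / len(subwords))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for subs, feats in pairs:
            for s in subs:
                # E-step: spread each subword over the features of its word.
                norm = sum(t[(s, f)] for f in feats)
                for f in feats:
                    c = t[(s, f)] / norm
                    count[(s, f)] += c
                    total[f] += c
        # M-step: renormalize the expected counts per feature.
        for (s, f), c in count.items():
            t[(s, f)] = c / total[f]
    return t

def align(subs, feats, t):
    """Link each subword to the most probable feature of the same word."""
    return {s: max(feats, key=lambda f: t[(s, f)]) for s in subs}

t = ibm1(pairs)
print(align(["walk", "ed"], ["LEMMA=walk", "PST"], t))
```

The appeal is that the model only needs co-occurrence of subwords and features, so no gold segmentation is required.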
Happy holidays! 🎄🎅🤩🎁
23.12.2025 10:51 — 👍 3 🔁 0 💬 0 📌 0
Attenzione! 🇮🇹 Know Piedmontese or Neapolitan speakers? @gianlucavico.bsky.social is collecting crowd-sourced translations to evaluate LLM performance on these regional languages. Partecipate!
10.11.2025 14:36 — 👍 2 🔁 1 💬 0 📌 0
Cultural awareness is trickier. Different data for different cultures means we can't really compare performance across cultures in a straightforward way. And there's no clear optimization target for cultural awareness beyond curating diverse training data.
21.10.2025 13:30 — 👍 1 🔁 0 💬 0 📌 0
☝️🧵 Most current approaches emphasize language neutrality: about two-thirds of VL benchmarks use translation-based evaluation. This makes sense because we can explicitly train for language neutrality when we have parallel data. But... 🧵👇
21.10.2025 13:30 — 👍 0 🔁 0 💬 1 📌 0
With @andrei-a-manea.bsky.social, we posted a survey on multilingual vision-language models 👉 arxiv.org/pdf/2509.22123
We reviewed 31 models+21 benchmarks. There's a tension between language neutrality (same results across languages) & cultural awareness (context matters differently across cultures)
21.10.2025 13:30 — 👍 3 🔁 2 💬 1 📌 0
Most vision-language models only work in English. We explore how different parallel data types (machine-translated vs authentic captions) affect cross-lingual transfer. Key finding: authentic data can outperform machine translation, and multilingual training beats bilingual approaches. #NLP
01.09.2025 15:38 — 👍 2 🔁 0 💬 0 📌 0
So proud of my PhD student @andrei-a-manea.bsky.social for his first first-author publication! 🎉 He presented this work last week at TSD. Investigating the Effect of Parallel Data in the Cross-Lingual Transfer for Vision-Language Encoders arxiv.org/pdf/2504.21681
01.09.2025 15:38 — 👍 6 🔁 0 💬 1 📌 0
For evaluation researchers: Simple string-overlap metrics (BLEU, chrF) work surprisingly well for factual QA. 🤔 When answers are mostly named entities, exact matches matter more than we thought.
LLM-as-judge 🦙🧑⚖️ correlates best with human judgment, though.
25.08.2025 08:06 — 👍 1 🔁 0 💬 1 📌 0
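To make the string-overlap point concrete, here is a rough sketch of a character n-gram F-score. It is a simplified, chrF-like score written for illustration only: it is not the official chrF implementation and not the evaluation code from the paper, and the function names and example answers are assumptions.

```python
from collections import Counter

def char_ngrams(text, n):
    """Count character n-grams of a lowercased, stripped string."""
    text = text.lower().strip()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def ngram_f_score(hypothesis, reference, max_n=3, beta=2.0):
    """Average character n-gram F-beta over n = 1..max_n (simplified, chrF-like)."""
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is too short for this n-gram order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            scores.append(0.0)
            continue
        scores.append((1 + beta**2) * precision * recall / (beta**2 * precision + recall))
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical model answers scored against the reference answer "Brno".
for answer in ["Brno", "city of Brno", "Prague"]:
    print(answer, round(ngram_f_score(answer, "Brno"), 3))
```

When the reference is a short named entity, an answer that contains it scores high and an unrelated entity scores close to zero, which is one plausible reason such metrics behave well on factual QA.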
The results are... humbling 😅
Even the best models:
>40% accuracy on textual questions
<30% on visual questions
Often perform better in English than the local language (!!)
Visual QA with regional images is especially challenging.
25.08.2025 08:06 — 👍 0 🔁 0 💬 1 📌 0
The problem: Most QA benchmarks focus on globally known facts. But real users ask about local geography, culture, and history.
We collected questions from native speakers in Czechia 🇨🇿, Slovakia 🇸🇰, and Ukraine 🇺🇦 about facts locals know but outsiders don't.
25.08.2025 08:06 — 👍 0 🔁 0 💬 1 📌 0
ufal/cus-qa · Datasets at Hugging Face
🧵 We're releasing CUS-QA - a new benchmark for testing LLMs on regional knowledge!
Find out what your model knows about Czechia 🇨🇿, Slovakia 🇸🇰, and Ukraine 🇺🇦!
👉 Textual and visual questions, answers, and human judgment on model outputs!
huggingface.co/datasets/ufa...
www.arxiv.org/abs/2507.22752
25.08.2025 08:06 — 👍 16 🔁 3 💬 1 📌 3
Stay tuned, we will release the dataset soon...
01.08.2025 16:49 — 👍 2 🔁 0 💬 0 📌 0
We need to have poster fights at the end of every conference.
29.07.2025 19:01 — 👍 3 🔁 1 💬 0 📌 0
Just presented MAGBIG, a new dataset and evaluation methodology for gender bias in multilingual text-to-image generation. Grammatical gender matters when studying these biases across languages!
Thanks to Felix Friedrich, @kathaem.bsky.social and all co-authors - it was fun to work on this together!
28.07.2025 13:14 — 👍 2 🔁 0 💬 0 📌 0
This week I am at #ACL2025NLP in Vienna 🎡🇦🇹. Find me 🕵️ or message 💌 me if you want to chat about multilinguality or tokenization. Stop 🛑 by our poster on gender bias in text-to-image generation on Monday aclanthology.org/2025.acl-lon...
27.07.2025 07:24 — 👍 7 🔁 0 💬 0 📌 0
TokShop 2025
Registering interest in all things tokenization at TokShop @ ICML 2025 (July 18)
Consider joining the Google group for future updates!
https://groups.google.com/g/tokshop
TokShop @ #ICML2025 got way more submissions than expected! 📈 We could really use a few more reviewers to help out. If you have the capacity to review a #tokenization paper by Saturday, please fill out this form: forms.gle/32A6sQHQrMSb... 🙏
02.06.2025 16:40 — 👍 0 🔁 4 💬 0 📌 2
ICML 2025 Workshop TokShop
📣 Call for Papers Alert: TokShop @ ICML 2025
TokShop explores tokenization across all data modalities. Topics include: subword NLP techniques, multimodal approaches, multilingual challenges, post-training modification, alternative representations, and statistical perspectives.
14.05.2025 13:31 — 👍 18 🔁 12 💬 1 📌 2
Tokenization Workshop @ ICML 2025
Got a tokenization paper that just didn't make the cut for ICML? Submit it to the Tokenization Workshop TokShop at #ICML2025 -- we'd love to see it there!
tokenization-workshop.github.io
04.05.2025 19:27 — 👍 7 🔁 6 💬 0 📌 0
Attending #NAACL2025 virtually. Since 2022, I've been training a classifier on papers I read to tackle the arXiv madness. Ran it on the NAACL proceedings for my personalized watch list. 🤓📺 However, it's far from perfect: Multilingual cultural awareness is great, but where is tokenization? 🤷
30.04.2025 12:50 — 👍 2 🔁 0 💬 2 📌 0
We're organizing the ✨Tokenization Workshop✨ TokShop❗ Join us at @icmlconf.bsky.social in July in Vancouver 🇨🇦. Follow @tokshop.bsky.social for updates! Submit your paper by May 30.
15.04.2025 17:37 — 👍 4 🔁 0 💬 0 📌 0
Random take on the #TuringTest: Rather than testing machine intelligence, it can be a measure of societal awareness about #AI capabilities. The real objective isn't creating a machine that passes but educating people to think critically and avoid being deceived, so the machines do not pass the test.
04.04.2025 19:37 — 👍 4 🔁 0 💬 0 📌 0
Our paper 'Beyond Literal Token Overlap: Token Alignability for Multilinguality' will be at #NAACL2025! We show that token alignability is a stronger predictor of cross-lingual transfer than literal token overlap.
Read it here: arxiv.org/abs/2502.06468
10.03.2025 15:48 — 👍 6 🔁 1 💬 0 📌 1
Welcome to SemEval-2025 Task-3 — Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes
Join Mu-SHROOM 🍄, a SemEval 2025 shared task on detecting hallucination spans in multilingual LLM outputs! 🌍 Includes Czech with regional Czech questions 🇨🇿. Do you think you can spot when something isn’t true? 🤔 Try it out! 👉 helsinki-nlp.github.io/shroom #SemEval2025 #NLP
14.01.2025 15:56 — 👍 4 🔁 0 💬 0 📌 1
Happy holidays! 🎄🎅🤩🎁
24.12.2024 13:36 — 👍 8 🔁 0 💬 0 📌 0
Host of scientific papers for @aclmeeting.bsky.social and other venues in the field of natural language processing. https://aclanthology.org/
#NLP #NLProc
Assistant Professor at Department of Computer Science, University of Copenhagen. Cross-cultural Natural Language Processing, food and positive impact are my passion. Dad, kendoka and language learner. danielhers.github.io
Parker Distinguished Professor, @UNC. Program Chair #EMNLP2024. Director http://MURGeLab.cs.unc.edu (@uncnlp). @Berkeley_AI @TTIC_Connect @IITKanpur
#NLP #CV #AI #ML
https://www.cs.unc.edu/~mbansal/
PhD Student of NLP | Researching Semantic Accuracy of Text Generation
Prof, Chair for AI & Computational Linguistics,
Head of MaiNLP lab @mainlp.bsky.social, LMU Munich
Co-director CIS @cislmu.bsky.social
Visiting Prof ITU Copenhagen @itu.dk
ELLIS Fellow @ellis.eu
Vice-President ACL
PI MCML @munichcenterml.bsky.social
PhD student @mainlp.bsky.social (@cislmu.bsky.social, LMU Munich). Interested in language variation & change, currently working on NLP for dialects and low-resource languages.
verenablaschke.github.io
NLP, Linguistics, Cognitive Science, AI, ML, etc.
Job currently: Research Scientist (NYC)
Job formerly: NYU Linguistics, MSU Linguistics
Center for Information and Language Processing (CIS): NLP research group at LMU Munich led by Hinrich Schuetze and @barbaraplank.bsky.social
PhD Student @HelsinkiNLP / Low-resource, Machine Translation, Knowledge Distillation, Multilinguality
Organized and sponsored by SIGLEX, the Special Interest Group of the ACL, *SEM brings together researchers interested in the semantics of natural languages and its computational modeling.
*SEM 2026: https://starsem2026.github.io
Assistant Professor at Bocconi University in MilaNLP group • Working in #NLP, #CSS and #Ethics • She/her • #ERCStG PERSONAE
Postdoc at ETH. Formerly, PhD student at the University of Cambridge :)
Associate professor at CMU, studying natural language processing and machine learning. Co-founder All Hands AI
PhD student @MaiNLP (Munich AI & NLP lab), @LMU.
Working on reasoning in large language models.
Postdoc in NLP @milanlp.bsky.social (Milan) and @nlpnorth.bsky.social (Copenhagen) | affiliated @aicentre.dk | past @mainlp.bsky.social, Amazon Alexa
🔗 elisabassignana.github.io
Professor of Data Science
Lead of @ds-hamburg.bsky.social
Researching Safe Generative AI
ACL Rolling Review (https://aclrollingreview.org)
Tweets by the ARR Communications / Support Team
#NLProc research group @itu.dk (Copenhagen, Denmark)
🔗 nlpnorth.github.io
Postdoc @TUM Heilbronn
NLP for low-resource languages
Computational Language Documentation, Morphology, Gloss
PhD student @ Charles University. Working on evaluation, explainability, and reasoning in NLP.