Andrea Piergentili

@apierg.bsky.social

PhD student at the University of Trento and @fbk-mt.bsky.social, working on gender-inclusive machine translation (he/him). Applied Scientist Intern at Amazon. apierg.github.io #NLP #NLProc #MT

315 Followers  |  275 Following  |  24 Posts  |  Joined: 04.09.2024

Latest posts by apierg.bsky.social on Bluesky

@bsavoldi.bsky.social presenting our new multilingual benchmark for evaluating LLMs on gender-neutral translation.

Catch our paper at #EMNLP2025
ℹ️ arxiv.org/pdf/2501.09409

#lt2025fbk

28.10.2025 10:44 | 👍 4    🔁 1    💬 0    📌 0
LT Highlights @ FBK 2025

🚀 Join us for the LT@FBK day 2025! Discover cutting-edge research and highlights in speech and language technologies from Fondazione Bruno Kessler (FBK)

📅 October 28, 2025
📍 FBK, Trento
ℹ️ lt-highlights.fbk.eu

21.10.2025 10:15 | 👍 3    🔁 1    💬 0    📌 0

Last but definitely not least: @bsavoldi.bsky.social presenting joint work with @apierg.bsky.social @matteo-negri.bsky.social @luisabentivogli.bsky.social on scalable gender neutral translation evaluation using LLM-as-a-judge at #GITT2025

23.06.2025 14:07 | 👍 10    🔁 3    💬 6    📌 1
Agree to Disagree? A Meta-Evaluation of LLM Misgendering Numerous methods have been proposed to measure LLM misgendering, including probability-based evaluations (e.g., automatically with templatic sentences) and generation-based evaluations (e.g., with automatic heuristics or human validation). However, it has gone unexamined whether these evaluation methods have convergent validity, that is, whether their results align. Therefore, we conduct a systematic meta-evaluation of these methods across three existing datasets for LLM misgendering. We propose a method to transform each dataset to enable parallel probability- and generation-based evaluation. Then, by automatically evaluating a suite of 6 models from 3 families, we find that these methods can disagree with each other at the instance, dataset, and model levels, conflicting on 20.2% of evaluation instances. Finally, with a human evaluation of 2400 LLM generations, we show that misgendering behaviour is complex and goes far beyond pronouns, which automatic evaluations are not currently designed to capture, suggesting essential disagreement with human evaluations. Based on our findings, we provide recommendations for future evaluations of LLM misgendering. Our results are also more widely relevant, as they call into question broader methodological conventions in LLM evaluation, which often assume that different evaluation methods agree.

Super interesting paper by Subramonian et al: "Agree to Disagree? A Meta-Evaluation of LLM Misgendering" arxiv.org/abs/2504.17075
Turns out, misgendering is messier than just pronouns. I'd love to see this analysis extended to grammatical gender languages! #LLM #AI #ethics @fbk-mt.bsky.social

04.06.2025 14:09 | 👍 4    🔁 0    💬 0    📌 1

🔍 We are studying how AI is used in Italy, and to do so we have built a survey!

👉 bit.ly/sondaggio_ai...

(it is anonymous, takes ~10 minutes, and if you take part or share it you help us a lot 🙏)

We are also interested in reaching people who do not work on AI and are not AI experts!

03.06.2025 10:24 | 👍 16    🔁 18    💬 1    📌 0
FAMA - a FBK-MT Collection The First Large-Scale Open-Science Speech Foundation Model for English and Italian

🚀 New tech report out! Meet FAMA, our open-science speech foundation model family for both ASR and ST in 🇬🇧 English and 🇮🇹 Italian.

The models are live and ready to try on @hf.co:
🔗 huggingface.co/collections/...

📄 Preprint: arxiv.org/abs/2505.22759

#ASR #ST #OpenScience #MultilingualAI

30.05.2025 15:35 | 👍 7    🔁 3    💬 0    📌 0

Will do 🫡

09.05.2025 08:48 | 👍 1    🔁 0    💬 0    📌 0
ALT: a woman is standing in front of a bookshelf in a bookstore and talking about research.

👀 Wanted: #Italian or #Dutch native speakers to take a survey on audiovisual translation for a master thesis student: watch a short video, answer some questions, help academic research 😎
⏩ Sharing = nice! ❤️
NL link: ugent.qualtrics.com/jfe/form/SV_...
IT link: ugent.qualtrics.com/jfe/form/SV_...

09.05.2025 08:29 | 👍 6    🔁 8    💬 1    📌 0
ALT: a man in a suit is making a funny face with the words dreams are expensive behind him

💭 Dreaming of attending #GITT2025 but need a little extra 💸 boost?
📣 Bursary applications to support participation are now open at tinyurl.com/gitt25
📆 Deadline May 9th
🙏 Thanks to our incredible sponsors DCA at Tilburg University tinyurl.com/tudca25 and FLW at Ghent University www.ugent.be/lw/en

29.04.2025 14:03 | 👍 7    🔁 7    💬 1    📌 0
Reserved topic scholarships | Doctoral Program - Information Engineering and Computer Science

📢 Come and join our group!
We offer a fully funded 3-year PhD position:

📔 Automatic translation with large multimodal models: iecs.unitn.it/education/ad...

📍 Full details for application: iecs.unitn.it/education/ad...

📅 Deadline May 12, 2025

#NLProc #FBK

22.04.2025 10:14 | 👍 10    🔁 10    💬 1    📌 0
An LLM-as-a-judge Approach for Scalable Gender-Neutral Translation Evaluation Gender-neutral translation (GNT) aims to avoid expressing the gender of human referents when the source text lacks explicit cues about the gender of those referents. Evaluating GNT automatically is pa...

Happy to announce that our paper 'An LLM-as-a-judge Approach for Scalable Gender-Neutral Translation Evaluation' was accepted at @gitt-workshop.bsky.social! 🙌

Check it out: arxiv.org/abs/2504.11934 🔥

Co-authors (🫶🏻): @bsavoldi.bsky.social, @matteo-negri.bsky.social, @luisabentivogli.bsky.social

17.04.2025 16:29 | 👍 10    🔁 3    💬 0    📌 0
Adding Chocolate to Mint: Mitigating Metric Interference in Machine Translation As automatic metrics become increasingly stronger and widely adopted, the risk of unintentionally "gaming the metric" during model development rises. This issue is caused by metric interference (Mint)...

Brilliant and necessary work by Pombal et al. about metric interference in MT system development and evaluation: arxiv.org/abs/2503.08327

Are we developing better systems or are we just gaming the metrics? And how do we address this?
Super (m)interesting! 👀

19.03.2025 15:25 | 👍 9    🔁 1    💬 0    📌 1

While we look forward to a sunny Geneva, why wait to join the conversation?

We've created a starter pack for our #GITT2025 friends!
🕵️ Follow researchers working on gender bias in MT
💬 Stay up to date and dive into the discussion!

All info at sites.google.com/tilburgunive...

28.02.2025 09:22 | 👍 22    🔁 16    💬 1    📌 1
BREAKING NEWS: CDC orders mass retraction and revision of submitted research across all science and medicine journals. Banned terms must be scrubbed. Any unpublished manuscript mentioning certain topics, including gender and "LGBT," must be pulled or revised.

BREAKING NEWS: CDC orders mass retraction and revision of submitted research across all science and medicine journals. Banned terms must be scrubbed.

Goes beyond MMWR +other CDC pubs. Applies to research already submitted to top medical journals.

Take a look.
open.substack.com/pub/insideme...

01.02.2025 21:20 | 👍 7734    🔁 4339    💬 575    📌 1644

🙌 All members of our group are now on Bluesky! 🙌

You can find all of us in this starter pack 👇

16.01.2025 09:51 | 👍 6    🔁 5    💬 0    📌 0

Looking ahead to 2025, my goal is to keep the momentum and build on this year's lessons: being more intentional about time management, becoming a better collaborator, and carving out time for deep, focused work.

27.12.2024 15:35 | 👍 1    🔁 0    💬 0    📌 0

The rest of the year was spent on testing new things, new collaborations, reading and reviewing papers, and traveling around for conferences. No doubt this has been the year where I learned the most so far, and 99% of the learning happened because I had access to some amazing (and patient) people.

27.12.2024 15:35 | 👍 1    🔁 0    💬 1    📌 0

I also developed a demo showcasing gender-neutral translation with LLMs, which I had the chance to present at FBK's Digital Industry Center Demo Day. Unfortunately the demo is not open to the public for now, but here is a photo of @bsavoldi.bsky.social and me presenting it ✌️

27.12.2024 15:35 | 👍 1    🔁 0    💬 1    📌 0

Two key resources enabled the research progress we made this year: GeNTE (2023) and Neo-GATE (2024). They are benchmarks for the conservative and the innovative approach respectively and are both freely available on Hugging Face:

huggingface.co/datasets/FBK...
huggingface.co/datasets/FBK...

27.12.2024 15:35 | 👍 3    🔁 0    💬 1    📌 0

My research topic is gender-inclusive MT, and this year we explored two directions: the "conservative" one with gender-neutral translation and the "innovative" one, using neomorphemes (like ə and *, in Italian). I worked on papers published at venues ranging from top conferences to local workshops.

27.12.2024 15:35 | 👍 1    🔁 0    💬 1    📌 0
ALT: but look i made you some is written in white on a black background

With 2024 wrapping up, and given how little I've posted here (or anywhere, really), I thought I'd share a quick recap of my year and finally make some ✨content✨

27.12.2024 15:35 | 👍 3    🔁 0    💬 1    📌 0

Our @apierg.bsky.social presenting our #calamita challenges at #CLiCit2024: machine translation and gender-fair generation.

Poster session upcoming, see you there!

For more details:
👉 MagneT: clic2024.ilc.cnr.it/wp-content/u...
👉 GFG: clic2024.ilc.cnr.it/wp-content/u...

06.12.2024 16:22 | 👍 9    🔁 2    💬 0    📌 0

Our very own @dennisfucci.bsky.social presenting the challenges of Explainability for Speech Models at #CLiCit2024. If you're interested, check out the paper 👉 clic2024.ilc.cnr.it/wp-content/u...
#NLProc

05.12.2024 16:04 | 👍 9    🔁 1    💬 0    📌 0

Today @luisabentivogli.bsky.social, Dennis Fucci, and @apierg.bsky.social presented a research communication about gender-neutral translation in the morning poster session #CLiCit2024 #NLProc

05.12.2024 10:40 | 👍 20    🔁 4    💬 1    📌 0

If you are in Pisa at #CLiCit2024, don't miss the presentation of our latest work today at 12 🔥

05.12.2024 09:02 | 👍 12    🔁 2    💬 0    📌 0

I've created an Italian #NLProc Researcher Starter Pack 🇮🇹

DM me to join if you're not in yet!

go.bsky.app/LHbWLHp

04.12.2024 17:04 | 👍 24    🔁 5    💬 0    📌 0

You're in ✌️😎

04.12.2024 13:49 | 👍 1    🔁 0    💬 0    📌 0

Hey, if you're in Pisa and interested in connecting with other people at #CLiCit2024, check out this starter pack. Raise a hand 👋 and I will add you to this list!

04.12.2024 11:00 | 👍 10    🔁 4    💬 3    📌 0
Dennis Fucci, Andrea Piergentili, and Luisa Bentivogli at CLiC-it 2024 | Machine Translation Unit

We are happy to announce that our PhD students Dennis Fucci and @apierg.bsky.social along with our head of unit @luisabentivogli.bsky.social, will attend #CLiCit2024 in Pisa! Meet them at the poster presentations during the main conference and the #CALAMITA event!

04.12.2024 10:43 | 👍 8    🔁 4    💬 0    📌 0
