Philipp Mondorf

@pmondorf.bsky.social

PhD student @MaiNLP (Munich AI & NLP lab), @LMU. Working on reasoning in large language models.

174 Followers  |  224 Following  |  12 Posts  |  Joined: 22.11.2024

Latest posts by pmondorf.bsky.social on Bluesky

👥 @veraneplenbroek.bsky.social, Sandro Pezzelle, @barbaraplank.bsky.social, @davidschlangen.bsky.social, Alessandro Suglia, @akskuchi.bsky.social, @ecekt.bsky.social, and @alberto-testoni.bsky.social.
📍 Poster Session 2 - Hall 4/5, 11:00–12:30, Monday, July 28.

#MaiNLP #MCML #NLProc

18.07.2025 10:19 - 👍 3    🔁 0    💬 0    📌 0

👥 Special thanks to @annabavaresco.bsky.social, @raffagbernardi.bsky.social, @leobertolazzi.bsky.social, @delliott.bsky.social, Raquel Fernández, Albert Gatt, @esamghaleb.bsky.social, Mario Giulianelli, @michaelwhanna.bsky.social, @akoller.bsky.social, @andre-t-martins.bsky.social

18.07.2025 10:19 - 👍 3    🔁 0    💬 1    📌 0

👥 This work is the result of a wonderful collaboration involving 20 researchers from 11 different universities.

18.07.2025 10:19 - 👍 0    🔁 0    💬 1    📌 0

🔎 Based on evaluations across 11 recent LLMs, we find that model judgments should be used with care, as they exhibit notable variability depending on the task and samples being evaluated. We argue that LLMs should be carefully validated against human judgments before being used as evaluators.

18.07.2025 10:19 - 👍 0    🔁 0    💬 1    📌 0

🔎 In this work, we study whether LLM judgments can be reliably used as proxies for human judgments. We introduce JUDGE-BENCH, an extensive collection of 20 datasets with human annotations covering a variety of NLP tasks.

18.07.2025 10:19 - 👍 0    🔁 0    💬 1    📌 0
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks There is an increasing trend towards evaluating NLP models with LLMs instead of human judgments, raising questions about the validity of these evaluations, as well as their reproducibility in the case...

📄 [ACL 2025 main] LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks (doi.org/10.48550/arX...)

18.07.2025 10:19 - 👍 9    🔁 4    💬 1    📌 0

👥 Huge thanks to my collaborators and co-authors, Sondre Wold and @barbaraplank.bsky.social
📍 Poster Session 7 - Hall 4/5, 10:30–12:00, Tuesday, July 29.

18.07.2025 10:19 - 👍 1    🔁 0    💬 1    📌 0

🔎 Moreover, we show that these circuits can be reused and combined through set operations to represent more complex functional capabilities of the model. For more information, check out the paper!

18.07.2025 10:19 - 👍 1    🔁 0    💬 1    📌 0

🔎 In this work, we study the relationship between transformer circuits identified for highly compositional and functionally related tasks. We find that functionally similar circuits exhibit both notable node overlap and cross-task faithfulness.

18.07.2025 10:19 - 👍 1    🔁 0    💬 1    📌 0
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models A fundamental question in interpretability research is to what extent neural networks, particularly language models, implement reusable functions through subnetworks that can be composed to perform mo...

📄 [ACL 2025 main] Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models (doi.org/10.48550/arX...)

18.07.2025 10:19 - 👍 5    🔁 2    💬 1    📌 0

I am happy to share that I’ll be attending #ACL2025 in Vienna 🇦🇹, where I’ll be presenting two papers (more information below)!

18.07.2025 10:19 - 👍 11    🔁 0    💬 1    📌 0
The hand-drawn sign from three years ago.

🎉 MaiNLP is turning 3 today! 🎂🥳 We’ve grown a lot since @barbaraplank.bsky.social started this group with nothing but three aspiring researchers and a hand-drawn sign on the door. Huge thanks to all the amazing people who have joined or visited us since. Here’s to many more years of exciting research! 🚀

01.04.2025 10:40 - 👍 19    🔁 9    💬 1    📌 2

🙋‍♂️

25.11.2024 18:03 - 👍 0    🔁 0    💬 0    📌 0
