Full house at BlackboxNLP at #EMNLP2025!! Getting ready for my 1.45PM keynote! Join us in A102 to learn about "Memorization: myth or mystery?"
09.11.2025 03:04

@vernadankers.bsky.social | Postdoc at Mila & McGill University with a PhD in NLP from the University of Edinburgh. Memorization vs generalization x (non-)compositionality. she/her
Full house at the IJCAI-JAIR Best Paper Award #talk delivered by Verna Dankers on "Compositionality Decomposed: How do Neural Networks Generalise?"! #IJCAI2025
20.08.2025 15:43

Thanks Jelle!! I'll make sure to prominently feature UvA in the talk tomorrow; all discussions for Compositionality Decomposed were held back at Nikhef!
20.08.2025 03:57

Congratulations to the winners of the 2025 IJCAI-JAIR Prize for their paper "Compositionality Decomposed: How Do Neural Networks Generalise?": Dieuwke Hupkes, Verna Dankers, Mathijs Mul, and Elia Bruni! Presented by Edith Elkind, Northwestern University. arxiv.org/abs/1908.08351
#IJCAI2025
Proud to accept a 5-year outstanding paper award @ijcai.org from the Journal of AI Research for the impact Compositionality Decomposed has had, on behalf of the team (Dieuwke Hupkes, Elia Bruni & Mathijs Mul)! Come to room 513 on Wed @ 11.30 to learn about rethinking compgen evaluation in the LLM era.
Had a blast at #ACL2025 connecting with new/familiar faces, explaining the many affiliations on my badge, chatting about memorisation vs generalisation, and visiting stunning Schönbrunn. A shoutout to the co-organisers & speakers of the successful @l2m2workshop.bsky.social. I learnt a lot from you!
04.08.2025 02:36

Main takeaway: students excel both by mimicking teachers and by deviating from them. And be careful with distillation: students may inherit teachers' pros and cons. Work done during a Microsoft internship. Find me Weds at 11 in poster session 4. arxiv.org/pdf/2502.01491 (5/5)
27.07.2025 15:37

...identifying scenarios in which students *outperform* teachers through SeqKD's amplified denoising effects. Lastly, we establish that with AdaptiveSeqKD (briefly finetuning your teacher on high-quality data prior to distillation) you can strongly decrease memorization and hallucinations. (4/5)
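[Editor's note: the post above describes AdaptiveSeqKD only at a high level. A rough sketch under that reading follows: briefly finetune the teacher on a small high-quality parallel set before generating distillation targets. The checkpoint, optimizer settings, and data below are illustrative placeholders, not the paper's setup.]

```python
# Rough sketch of the AdaptiveSeqKD idea: briefly finetune the teacher on a small
# high-quality (HQ) parallel set, then distill as usual. All choices are illustrative.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

teacher_name = "Helsinki-NLP/opus-mt-en-de"      # placeholder teacher checkpoint
tokenizer = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModelForSeq2SeqLM.from_pretrained(teacher_name)
optimizer = torch.optim.AdamW(teacher.parameters(), lr=1e-5)

hq_pairs = [("The treaty was signed in 1990.", "Der Vertrag wurde 1990 unterzeichnet.")]

teacher.train()
for _ in range(1):                               # "briefly": only a short finetuning run
    for src, tgt in hq_pairs:
        batch = tokenizer(src, text_target=tgt, return_tensors="pt")
        loss = teacher(**batch).loss             # standard cross-entropy on HQ references
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# After this short adaptation step, the teacher re-translates the training corpus and
# the student is trained on those targets, as in plain SeqKD.
```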
27.07.2025 15:37

Students showed increases in verbatim memorization, large increases in extractive memorization, and also hallucinated much more than baselines! We go beyond average-case performance through additional analyses of how teacher, baseline, and student models perform on data subgroups... (3/5)
27.07.2025 15:37

SeqKD is still widely applied in NMT to obtain strong & small deployable systems, but your student models do not only inherit good things from teachers. In our short paper, we contrast baselines trained on the original corpus with students trained on teacher-generated targets. (2/5)
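[Editor's note: for readers unfamiliar with sequence-level knowledge distillation, here is a minimal sketch, not the paper's code, of the data flow it implies: the teacher re-translates the training sources, and the student is trained on those teacher outputs instead of the human references. The Hugging Face checkpoint and sentences are placeholders.]

```python
# Minimal SeqKD sketch: build (source, teacher translation) pairs for the student.
# Assumption: any seq2seq translation model can play the teacher; the checkpoint
# below is an illustrative choice only.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

teacher_name = "Helsinki-NLP/opus-mt-en-de"  # placeholder teacher checkpoint
tokenizer = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModelForSeq2SeqLM.from_pretrained(teacher_name)

def seqkd_targets(sources, max_new_tokens=128):
    """Teacher re-translates the training sources; these outputs replace the
    human references as the student's training targets."""
    batch = tokenizer(sources, return_tensors="pt", padding=True, truncation=True)
    outputs = teacher.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

sources = ["The committee approved the proposal.", "She studies memorization in NMT."]
distilled_pairs = list(zip(sources, seqkd_targets(sources)))
# Baseline: student architecture trained on (source, human reference) pairs.
# SeqKD student: same architecture trained on distilled_pairs instead.
```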
27.07.2025 15:37

Thrilled to be in Vienna to learn about all the papers & catch up with NLP friends! You'll find me at the ACL mentorship session (Mon 2PM), at our @l2m2workshop.bsky.social (Fri), and at my poster (Wed 11AM)! I'll present work w/ @vyraun.bsky.social on memorization in NMT under SeqKD #ACL2025 (1/5)
27.07.2025 15:37

I miss Edinburgh and its wonderful people already!! Thanks to @tallinzen.bsky.social and @edoardo-ponti.bsky.social for inspiring discussions during the viva! I'm now exchanging Arthur's Seat for Mont Royal to join @sivareddyg.bsky.social's wonderful lab @mila-quebec.bsky.social
01.07.2025 21:33

[Image: a circular diagram with a blue whale icon at the center, showing eight interconnected research areas around LLM reasoning as colored boxes: §3 Analysis of Reasoning Chains, §4 Scaling of Thoughts (thought length and performance), §5 Long Context Evaluation (information recall), §6 Faithfulness to Context (question-answering accuracy), §7 Safety Evaluation (harmful content generation and jailbreak resistance), §8 Language & Culture (moral reasoning and language effects), §9 Relation to Human Processing (comparing cognitive processes), §10 Visual Reasoning (ASCII generation), and §11 Following Token Budget (direct prompting). Arrows connect the sections in a clockwise flow.]
Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1's reasoning chains across a variety of tasks, investigating its capabilities, limitations, and behaviour.
Preprint: mcgill-nlp.github.io/thoughtology/
11.03.2025 01:02

Super excited about the First L2M2 (Large Language Model Memorization) workshop at #ACL2025 in Vienna @l2m2workshop.bsky.social! Submit via ARR in February, or directly to the workshop in March! Archival and non-archival contributions are welcome. sites.google.com/view/memoriz...
27.01.2025 22:03

thinking of calling this "The Illusion Illusion"
(more examples below)