
Verna Dankers

@vernadankers.bsky.social

Postdoc at β€ͺMila & McGill University πŸ‡¨πŸ‡¦ with a PhD in NLP from the University of Edinburgh 🏴󠁧󠁒󠁳󠁣󠁴󠁿 memorization vs generalization x (non-)compositionality. she/her πŸ‘©β€πŸ’» πŸ‡³πŸ‡±

306 Followers  |  278 Following  |  12 Posts  |  Joined: 19.11.2024

Latest posts by vernadankers.bsky.social on Bluesky


Full house at BlackboxNLP at #EMNLP2025!! Getting ready for my 1.45PM keynote 😎 Join us in A102 to learn about "Memorization: myth or mystery?"

09.11.2025 03:04 β€” πŸ‘ 12    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

πŸ† full house at the IJCAI-JAIR Best Paper Award #talk delivered by Verna Dankers on Compositionality Decomposed: How do Neural Networks Generalise?! #IJCAI2025

20.08.2025 15:43 β€” πŸ‘ 5    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

Thanks Jelle!! I'll make sure to prominently feature UvA in the talk tomorrow 😎 all discussions for Compositionality Decomposed were held back at Nikhef!

20.08.2025 03:57 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Congratulations to the winners of the 2025 IJCAI–JAIR Prize for their paper β€œCompositionality Decomposed: How Do Neural Networks Generalise?” β€” Dieuwke Hupkes, Verna Dankers, Mathijs Mul, and Elia Bruni! Presented by Edith Elkind, Northwestern University arxiv.org/abs/1908.08351
#IJCAI2025

19.08.2025 19:11 β€” πŸ‘ 5    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

Proud to accept a 5y outstanding paper award
@ijcai.org πŸ† from the Journal of AI Research for the impact Compositionality Decomposed has had, on behalf of the team (Dieuwke Hupkes, Elia Bruni & Mathijs Mul)!🧑 Come to room 513 on Wed@11.30 to learn about rethinking compgen evaluation in the LLM eraπŸ€–

19.08.2025 18:27 β€” πŸ‘ 14    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

Had a blast at #ACL2025 connecting with new/familiar faces, explaining the many affiliations on my badge, chatting about memorisation–generalisation, and visiting stunning Schönbrunn. A shoutout to the co-organisers & speakers of the successful @l2m2workshop.bsky.social 🧑 I learnt a lot from you!

04.08.2025 02:36 β€” πŸ‘ 7    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Main takeaway: students excel both by mimicking teachers and by deviating from them; be careful with distillation, as students may inherit teachers’ pros and cons. Work done during a Microsoft internship. Find me πŸ‘‹πŸΌ Wed at 11 in poster session 4 arxiv.org/pdf/2502.01491 (5/5)

27.07.2025 15:37 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

...identifying scenarios in which students *outperform* teachers through SeqKD’s amplified denoising effects. Lastly, we show that AdaptiveSeqKD (briefly finetuning your teacher on high-quality data prior to distillation) sharply reduces memorization and hallucinations. (4/5)

27.07.2025 15:37 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
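A toy sketch of the AdaptiveSeqKD idea described in this thread (all names hypothetical; the real setup briefly finetunes an NMT teacher on high-quality parallel data before it generates distillation targets, whereas the "teacher" here is just a lookup table):

```python
# Toy sketch of AdaptiveSeqKD: briefly "finetune" the teacher on a
# high-quality (HQ) subset before using it to generate the targets the
# student is distilled on. A dict stands in for a real NMT teacher.

def finetune(teacher: dict, hq_pairs: dict) -> dict:
    # "Finetuning" here = overriding the teacher's noisy outputs
    # with the HQ references, leaving everything else untouched.
    updated = dict(teacher)
    updated.update(hq_pairs)
    return updated

def distill_targets(teacher: dict, sources: list) -> list:
    # The (adapted) teacher generates the student's training targets.
    return [(src, teacher[src]) for src in sources]

noisy_teacher = {"hallo": "helo!!", "dag": "bye"}  # "helo!!" = memorized noise
hq_data = {"hallo": "hello"}                       # small clean subset
adapted = finetune(noisy_teacher, hq_data)
targets = distill_targets(adapted, ["hallo", "dag"])
```

The point of the sketch: the student never sees the teacher's memorized noise for inputs covered by the HQ finetuning data.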

Students showed increases in verbatim memorization, large increases in extractive memorization, and hallucinated much more than baselines! We go beyond average-case performance with additional analyses of how the teacher (T), baseline (B), and student (S) perform on data subgroups... (3/5)

27.07.2025 15:37 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

SeqKD is still widely applied in NMT to obtain strong & small deployable systems, but student models inherit more than just the good things from their teachers. In our short paper, we contrast baselines trained on the original corpus to students (trained on teacher-generated targets). (2/5)

27.07.2025 15:37 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
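A minimal sketch of the baseline-vs-SeqKD contrast described in this thread (all names hypothetical; `teacher_translate` is a toy stand-in for a real NMT teacher's beam-search output):

```python
# Sequence-level knowledge distillation (SeqKD) for NMT, in miniature:
# the baseline student trains on the original parallel corpus, while the
# SeqKD student trains on targets *generated by the teacher*.

def teacher_translate(source: str) -> str:
    # Hypothetical teacher: a toy uppercase "translation".
    return source.upper()

def build_training_data(corpus: list, use_seqkd: bool) -> list:
    """Return (source, target) pairs for student training.

    use_seqkd=False -> baseline: keep the original reference targets.
    use_seqkd=True  -> SeqKD: replace targets with teacher outputs.
    """
    if not use_seqkd:
        return list(corpus)
    return [(src, teacher_translate(src)) for src, _ref in corpus]

corpus = [("hallo wereld", "hello world"), ("goede morgen", "good morning")]
baseline_data = build_training_data(corpus, use_seqkd=False)
seqkd_data = build_training_data(corpus, use_seqkd=True)
```

The design choice SeqKD makes is that the student imitates the teacher's output distribution rather than the references, which is exactly why it can inherit the teacher's memorization and hallucination behaviour along with its strengths.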

Thrilled to be in Vienna to learn about all the papers & catch up with NLP friends πŸ‡¦πŸ‡Ή You'll find me at the ACL mentorship session (Mon 2PM), at our @l2m2workshop.bsky.social (Fri) and at my poster (Wed 11AM)! I'll present work w/ @vyraun.bsky.social on memorization in NMT under SeqKD #ACL2025 (1/5)

27.07.2025 15:37 β€” πŸ‘ 8    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I miss Edinburgh and its wonderful people already!! Thanks to @tallinzen.bsky.social and @edoardo-ponti.bsky.social for inspiring discussions during the viva! I'm now exchanging Arthur's Seat for Mont Royal to join @sivareddyg.bsky.social's wonderful lab @mila-quebec.bsky.social 🀩

01.07.2025 21:33 β€” πŸ‘ 15    πŸ” 1    πŸ’¬ 3    πŸ“Œ 0
A circular diagram with a blue whale icon at the center. The diagram shows 8 interconnected research areas around LLM reasoning represented as colored rectangular boxes arranged in a circular pattern. The areas include: Β§3 Analysis of Reasoning Chains (central cloud), Β§4 Scaling of Thoughts (discussing thought length and performance metrics), Β§5 Long Context Evaluation (focusing on information recall), Β§6 Faithfulness to Context (examining question answering accuracy), Β§7 Safety Evaluation (assessing harmful content generation and jailbreak resistance), Β§8 Language & Culture (exploring moral reasoning and language effects), Β§9 Relation to Human Processing (comparing cognitive processes), Β§10 Visual Reasoning (covering ASCII generation capabilities), and Β§11 Following Token Budget (investigating direct prompting techniques). Arrows connect the sections in a clockwise flow, suggesting an iterative research methodology.


Models like DeepSeek-R1 πŸ‹ mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks; investigating its capabilities, limitations, and behaviour.
πŸ”—: mcgill-nlp.github.io/thoughtology/

01.04.2025 20:06 β€” πŸ‘ 52    πŸ” 16    πŸ’¬ 1    πŸ“Œ 9

πŸ‘‹

11.03.2025 01:02 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Super excited about the First L2M2 (Large Language Model Memorization) workshop at #ACL2025 in Vienna @l2m2workshop.bsky.social! πŸ₯³ Submit via ARR in February, or directly to the workshop in March! Archival and non-archival contributions are welcome. πŸ—’οΈ sites.google.com/view/memoriz...

27.01.2025 22:03 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

thinking of calling this "The Illusion Illusion"

(more examples below)

01.12.2024 14:33 β€” πŸ‘ 1585    πŸ” 389    πŸ’¬ 60    πŸ“Œ 91
