Desmond Elliott

@delliott.bsky.social

394 Followers  |  61 Following  |  34 Posts  |  Joined: 10.09.2023

Posts by Desmond Elliott (@delliott.bsky.social)

Generative AI Archaeology | ICLR Blogposts 2026 We document the rise of the Generative AI Archaeologist, whose tools include linear algebra and probability theory, jailbreaking, and debuggers, compared to the metal detectors, pickaxes, and radar su...

I welcome feedback on this idea and I would love to hear from you if you know about examples of non-LLM model archaeology. The accepted version of the blog is here: iclr-blogposts.github.io/2026/blog/20...

02.03.2026 14:56 | 👍 1    🔁 0    💬 0    📌 0

In the blog, I discuss papers and personal correspondence with researchers on their findings about inferring training data, model training procedures, stealing deployed models and system prompts.

02.03.2026 14:56 | 👍 1    🔁 1    💬 1    📌 0

I have been thinking about some of the consequences of closed vs open research. Closed research can slow down scientific progress and concentrate knowledge, which results in what I call “model archaeology”. I discuss this idea in my ICLR 2026 Blogpost.

Short thread 🧵 and link 👇

02.03.2026 14:56 | 👍 10    🔁 2    💬 1    📌 1

🚨 New paper

Are visual tokens going into an LLM interpretable? 🤔

Existing methods (e.g. logit lens) and assumptions would lead you to think “not much”...

We propose LatentLens and show that most visual tokens are interpretable across *all* layers 💡

Details 🧵

11.02.2026 14:12 | 👍 28    🔁 6    💬 1    📌 6
A photograph of sunny Copenhagen in the summer!

📢 I am hiring a highly motivated Ph.D student at the University of Copenhagen to work on tokenization-free NLP.

Read our previous work on this topic: aclanthology.org/2025.emnlp-m...
aclanthology.org/2023.emnlp-m...
openreview.net/forum?id=FkS...

Apply by March 8: employment.ku.dk/phd/?show=1563

04.02.2026 10:40 | 👍 19    🔁 9    💬 0    📌 0
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks There is an increasing trend towards evaluating NLP models with LLMs instead of human judgments, raising questions about the validity of these evaluations, as well as their reproducibility in the case...

📄 [ACL 2025 main] LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks (doi.org/10.48550/arX...)

18.07.2025 10:19 | 👍 10    🔁 4    💬 1    📌 0

Three invited speakers will share their insights at TokShop! Hear from Yuval Pinter @uvp.bsky.social, Desmond Elliott @delliott.bsky.social, and Adrian Łańcucki on cutting-edge tokenization research. Don't miss these keynote presentations! #ICML2025 tokenization-workshop.github.io/speakers

16.07.2025 21:13 | 👍 9    🔁 2    💬 0    📌 0
MAXMINDS 2.0 Homepage MAXMINDS 2.0

It is with great pleasure that I share MAXMINDS 2.0, a new Max Planck program to support scholars in danger of displacement by war or natural disasters, and who have limited access to resources and institutional support.

If you know affected scholars, please share.

www.maxminds.mpg.de

07.07.2025 19:30 | 👍 24    🔁 22    💬 0    📌 0
Postdoc in Natural Language Processing

📢 I am hiring a Postdoc to work on post-training methods for low-resource languages. Apply by August 15 employment.ku.dk/faculty/?sho....
Let's talk at #ACL2025NLP in Vienna if you want to know more about the position and life in Denmark.

07.07.2025 12:47 | 👍 23    🔁 12    💬 0    📌 0

💡 Beyond math/code, instruction following with verifiable constraints is suitable to be learned with RLVR.
But the set of constraints and verifier functions is limited and most models overfit on IFEval.
We introduce IFBench to measure model generalization to unseen constraints.

03.07.2025 21:06 | 👍 29    🔁 5    💬 1    📌 1
Dara

📣 I am happy to support Ph.D applications to the Danish Advanced Research Academy. My main areas of research include multimodal learning and tokenization-free language processing. Feel free to reach out if you have similar interests! Applications due August 29 www.daracademy.dk/fellowship/f...

26.06.2025 14:40 | 👍 4    🔁 1    💬 0    📌 0

Following #CVPR2025, #ICCV2025 implemented a new policy targeting accountability and integrity. PCs identified 25 highly irresponsible reviewers, resulting in the desk rejection of 29 associated papers, including 12 submissions that otherwise would have been accepted.

25.06.2025 18:00 | 👍 21    🔁 9    💬 0    📌 1
Sara presenting her poster on reasoning with DeepSeek-R1

Antonia presenting her poster (not visible in the image)

The participants brought a lot of energy, enthusiasm, and great posters to highlight their research: @antoniakrm.bsky.social and @saravera.bsky.social pictured.

Finally, I want to thank the Danish Data Science Academy, Carlsberg Foundation, and the Villum Foundation for supporting the event!

23.06.2025 15:13 | 👍 4    🔁 0    💬 0    📌 0

Huge thanks to everyone who attended the Copenhagen NLP Symposium last week. Thanks to our wonderful speakers @kylelo.bsky.social, @najoung.bsky.social, Yohei Oseki, @mziizm.bsky.social, and @loubnabnl.hf.co! @mariaa.bsky.social did a great job of summarizing the talks in these liveposts (quoted).

23.06.2025 15:13 | 👍 8    🔁 0    💬 1    📌 0

Thought-provoking interview with Meg about “AGI”

22.06.2025 18:19 | 👍 2    🔁 0    💬 1    📌 0

No, we didn’t record anything but there was an excellent live-poster!

20.06.2025 17:41 | 👍 2    🔁 0    💬 1    📌 0

📯 Best Paper Award at CVPR workshop on Visual concepts for our (@doneata.bsky.social + @delliott.bsky.social) paper on probing vision/lang/vision+lang models for semantic norms!

TLDR: SSL vision models (swinV2, dinoV2) are surprisingly similar to LLM & VLMs even w/o lang 👀
arxiv.org/abs/2506.03994

13.06.2025 15:15 | 👍 12    🔁 4    💬 1    📌 0

Your workshop is so popular that someone is managing the door on a one-in, one-out basis.

11.06.2025 18:52 | 👍 2    🔁 0    💬 0    📌 0
Where and when to find me at #CVPR2025 this week

I am looking forward to meeting people working on multimodality at #CVPR2025. You can find me hopping between the @vlms4all.bsky.social and Visual Concepts Workshops on Thursday. Feel free to reach out if you want to grab a coffee ☕ or a beer 🍻 during the week!

11.06.2025 00:22 | 👍 1    🔁 0    💬 0    📌 0

Announcing our recent work “Multilingual Pretraining for Pixel Language Models”! We introduce PIXEL-M4, a pixel language model pretrained on four visually & linguistically diverse scripts: English, Hindi, Ukrainian & Simplified Chinese. #NLProc

04.06.2025 13:44 | 👍 1    🔁 1    💬 1    📌 1
Paper title "Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory"

I am excited to announce our latest work 🎉 "Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory". We review recent works on culture in VLMs and argue for deeper grounding in cultural theory to enable more inclusive evaluations.

Paper 🔗: arxiv.org/pdf/2505.22793

02.06.2025 10:36 | 👍 57    🔁 18    💬 3    📌 5
Copenhagen NLP Symposium 2025 symposium website

📢 The Copenhagen NLP Symposium on June 20th!

- Invited talks by @loubnabnl.hf.co (HF) @mziizm.bsky.social (Cohere) @najoung.bsky.social (BU) @kylelo.bsky.social (AI2) Yohei Oseki (UTokyo)
- Exciting posters by other participants

Register to attend and/or present your poster at cphnlp.github.io /1

26.05.2025 13:08 | 👍 35    🔁 12    💬 1    📌 3
Job vacancies (OBP) - Georg-August-Universität Göttingen: web pages of the Georg-August-Universität Göttingen

Interested in multilingual tokenization in #NLP? Lisa Beinborn and I are hiring!

PhD candidate position in Göttingen, Germany: www.uni-goettingen.de/de/644546.ht...

PostDoc position in Leuven, Belgium:
www.kuleuven.be/personeel/jo...

Deadline 6th of June

16.05.2025 08:23 | 👍 25    🔁 13    💬 2    📌 2

Has anyone written anything about *scraping and text processing* for internet pretraining data? Practical details, which tools are used, which webpage elements are considered, how HTML to text conversion is done?

(I know about work on quality filters, relevant but not quite what I'm looking for)

09.05.2025 10:05 | 👍 39    🔁 10    💬 11    📌 2

Thanks for sharing! I'm looking forward to reading this because I enjoyed reading your lecture notes on Natural Language Understanding with Distributed Representation back in the day.

08.05.2025 14:17 | 👍 2    🔁 0    💬 0    📌 0

Had fun talking at the Spurious Correlations & Shortcut Learning workshop at ICLR! One example I brought up, which I think provides an uncommon perspective: a case where spurious shortcuts can improve generalization... even to out-of-distribution sets where the spurious feature doesn't generalize! Thread:

01.05.2025 00:32 | 👍 18    🔁 2    💬 1    📌 0

What would you do if someone has rolled your dataset into their benchmark (cool!) but marked it as being available under a much more permissive license (not so cool)?

14.04.2025 13:44 | 👍 4    🔁 0    💬 0    📌 0
Postdoctoral Researcher in Natural Language Processing Postdoc in Natural Language Processing, Department of Computer Science, Faculty of Science, University of Copenhagen The Natural Language Process

I'm recruiting a postdoc on an 18-month contract candidate.hr-manager.net/ApplicationI.... The position is about deploying LLMs in the Danish public sector. This is an interdisciplinary project that touches on technical, ethical, and legal aspects of LLM usage. Apply by 1 May 2025.

12.04.2025 08:48 | 👍 5    🔁 2    💬 0    📌 0

Very excited to release Kaleidoscope, a multilingual, multimodal evaluation set for VLMs, built as part of our open-science initiative!

🌍 18 languages (high-, mid-, low-)
📚 21k questions (55% require image understanding)
🧪 STEM, social science, reasoning, and practical skills

10.04.2025 19:52 | 👍 10    🔁 4    💬 1    📌 0

🚀 We are excited to introduce Kaleidoscope, the largest culturally-authentic exam benchmark.

📌 Most VLM benchmarks are English-centric or rely on translations, missing linguistic & cultural nuance. Kaleidoscope expands in-language multilingual 🌎 & multimodal 👀 VLM evaluation

10.04.2025 20:24 | 👍 18    🔁 7    💬 1    📌 2