
Martin Gubri

@mgubri.bsky.social

Research Lead @parameterlab.bsky.social working on Trustworthy AI | Speaking 🇫🇷 French, English and 🇨🇱 Spanish | Living in Tübingen 🇩🇪 | he/him | https://gubri.eu

108 Followers  |  412 Following  |  32 Posts  |  Joined: 18.11.2024

Latest posts by mgubri.bsky.social on Bluesky

πŸͺ© New paper out!

Evaluating large models on benchmarks like MMLU is expensive. DISCO cuts costs by up to 99% while still predicting performance well.

πŸ” The trick: use a small subset of samples where models disagree the most. These are the most informative.

Join the dance party below πŸ‘‡

13.10.2025 09:29 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
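
A minimal sketch of the disagreement idea (my own toy version with hypothetical helper names, not the actual DISCO code): rank benchmark items by how much a pool of reference models disagrees on them, evaluate a new model only on the top items, and map its subset accuracy back to a full-benchmark estimate.

```python
# Toy disagreement-based subset selection (assumes a 0/1 correctness matrix
# from a pool of reference models is available; not the DISCO implementation).
import numpy as np
from sklearn.linear_model import LinearRegression

def pick_disagreement_subset(correct, k):
    """correct: (n_models, n_items) 0/1 matrix. Return the k most contested items."""
    p = correct.mean(axis=0)            # fraction of reference models that are right
    disagreement = p * (1 - p)          # largest when models split ~50/50
    return np.argsort(disagreement)[::-1][:k]

def predict_full_accuracy(correct, subset_idx, new_model_subset_acc):
    """Calibrate subset accuracy -> full accuracy on the reference pool, then
    apply the fitted map to a new model evaluated only on the subset."""
    sub = correct[:, subset_idx].mean(axis=1).reshape(-1, 1)
    full = correct.mean(axis=1)
    reg = LinearRegression().fit(sub, full)
    return float(reg.predict([[new_model_subset_acc]])[0])
```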

They found the universal intro for all papers:
<insert name> should be correct. But in reality, that is rarely true.

11.09.2025 15:35 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Thanks a lot Guillaume :)

21.08.2025 16:03 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

πŸŽ‰ Delighted to announce that our πŸ«—Leaky Thoughts paper about contextual privacy with reasoning models is accepted to #EMNLP main!
Huge congrats to the amazing team: Tommaso Green, Haritz Puerto, @coallaoh.bsky.social and @oodgnas.bsky.social

21.08.2025 15:16 β€” πŸ‘ 6    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

Fantastic new paper by @reeserichardson.bsky.social et al.

An enormous amount of work showing the extent of coordinated scientific fraud and involvement of some editors.
The number of fraudulent publications grows at a rate far outpacing that of legitimate science.
www.pnas.org/doi/10.1073/...

04.08.2025 21:27 β€” πŸ‘ 134    πŸ” 59    πŸ’¬ 6    πŸ“Œ 4

I agree that there is a gap in the number of parameters that a high-end device and a cheap one can run. I guess that "common consumer device" means a mid-range one. But I totally agree that they should specify the type of device: a mobile phone is quite different from a desktop computer.

22.07.2025 09:35 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

My pleasure! Yes, I guess so. I agree that a moving definition can be quite annoying for research. At the same time, I think it is not specific to LMs: a large file, a heavy piece of software, etc. Something that required a lot of resources 15 years ago is probably quite small for today's hardware.

22.07.2025 07:20 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Def SLM detailed

There are more details in Appendix A.

21.07.2025 22:27 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
SLM definition

This NVIDIA position paper has a clear definition of an SLM: arxiv.org/abs/2506.02153
They consider models under 10B parameters to be SLMs.
Personally, I would not consider 13B models to be SLMs (not even 7B): they require quite a lot of resources unless you use aggressive efficient-inference techniques (like 4-bit quantization).

21.07.2025 22:24 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0
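
For a rough sense of scale, here is the back-of-the-envelope weight-memory arithmetic behind that point (my own numbers, ignoring activations and the KV cache; not from the NVIDIA paper):

```python
# Weight memory only, ignoring activations and the KV cache (rough figures).
def weight_gib(n_params_billions, bits_per_param):
    return n_params_billions * 1e9 * bits_per_param / 8 / 2**30

for bits in (16, 8, 4):
    print(f"13B @ {bits}-bit: ~{weight_gib(13, bits):.1f} GiB")
# 16-bit: ~24.2 GiB -> far beyond most consumer GPUs or phones
#  4-bit:  ~6.1 GiB -> only fits with aggressive quantization
```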

This has been explored quite a lot for the task of jailbreaking an LLM (i.e., adversarial examples against LLM alignment). For example:
- arxiv.org/abs/2310.08419
- arxiv.org/abs/2312.02119
- arxiv.org/abs/2502.01633

16.07.2025 19:12 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

πŸ“’ New paper out: Does SEO work for LLM-based conversational search?

We introduce C-SEO Bench, a benchmark to test if conversational SEO methods actually help.
Our finding? They don't. But traditional SEO still works because LLMs favour content already ranked higher in the prompt.

23.06.2025 16:40 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
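
A hedged sketch of how one could probe that position bias (my own toy harness with a placeholder ask_llm callable, not the C-SEO Bench code): present the same candidate documents in two orders and check whether the model's pick follows the position rather than the content.

```python
# Toy position-bias probe; `ask_llm` is a placeholder for any prompt -> answer call.
def build_prompt(question, docs):
    ctx = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"{ctx}\n\nQuestion: {question}\nAnswer by citing one document id."

def position_bias_probe(ask_llm, question, docs):
    answer_fwd = ask_llm(build_prompt(question, docs))
    answer_rev = ask_llm(build_prompt(question, list(reversed(docs))))
    # If the cited id stays [1] in both runs even though the content at that
    # position changed, the model is favouring prompt position over content.
    return answer_fwd, answer_rev
```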
Meme: 'EMNLP' crashing into 'The weekend after the NeurIPS deadline'

The mood on a Friday evening

16.05.2025 15:56 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Excited to share that our paper "Scaling Up Membership Inference: When and How Attacks Succeed on LLMs" will be presented next week at #NAACL2025!
πŸ–ΌοΈ Catch us at Poster Session 8 - APP: NLP Applications
πŸ—“οΈ May 2, 11:00 AM - 12:30 PM
πŸ—ΊοΈ Hall 3
Hope to see you there!

26.04.2025 10:11 β€” πŸ‘ 2    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

The authors show that LLMs often give opposite answers when forced to choose vs. when not (e.g., in open-ended generation). Similarly, the conclusions are highly unstable across prompt variations.

24.04.2025 15:35 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
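
For concreteness, here is roughly what the two elicitation modes look like (my own toy prompts, not the paper's templates):

```python
# Two ways of eliciting a stance on the same statement (toy prompts).
STATEMENT = "The government should regulate AI development."

forced_choice = (f'Statement: "{STATEMENT}"\n'
                 "Answer with exactly one word: Agree or Disagree.")
open_ended = (f'Statement: "{STATEMENT}"\n'
              "What is your view? Answer in free text.")
# Reported finding: the stance extracted from the open-ended answer often
# contradicts the forced-choice answer, and both shift with prompt phrasing.
```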
Link preview: Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models. Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Kirk, Hinrich Schuetze, Dirk Hovy. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Vol...

I agree with the need for transparency, especially because the results seem highly dependent on the evaluation details. There is a really nice ACL 2024 paper about this: aclanthology.org/2024.acl-lon...

24.04.2025 15:35 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

A Bluesky filter to recommend only posts about papers from your followers. This is what I was missing to use Bluesky!

14.03.2025 08:12 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I am pleased to announce that our paper on the scale of LLM membership inference from @parameterlab.bsky.social has been accepted for publication at #NAACL2025 as Findings!

23.01.2025 14:04 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Congrats Michael! πŸ‘πŸŽ‰
Will you stay in Paris?

22.11.2024 19:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

πŸŽ‰We’re pleased to share the release of the models from our ApricotπŸ‘ paper, accepted at ACL 2024!
At Parameter Lab, we believe openness and reproducibility are essential for advancing science, and we've put in our best effort to ensure it.
πŸ€— huggingface.co/collections/...
🧡 bsky.app/profile/dnns...

20.11.2024 23:55 β€” πŸ‘ 9    πŸ” 3    πŸ’¬ 0    πŸ“Œ 0

πŸ“„ Excited to share our latest paper on the scale required for successful membership inference in LLMs! We investigate a continuum from single sentences to large document collections. Huge thanks to an incredible team: Haritz Puerto, @coallaoh.bsky.social and @oodgnas.bsky.social!

19.11.2024 14:23 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 4

Congrats Guillaume! πŸ‘

19.11.2024 13:54 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Have a look at the πŸ‘ Apricot paper that we presented at ACL earlier this year. This project was a wonderful collaboration with @dnnslmr.bsky.social!

18.11.2024 16:57 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

After going to NAACL, ACL and #EMNLP2024 this year, here are a few tips I’ve picked up about attending #NLP conferences.

Would love to hear any other tips if you have them!

This proved very popular on another (more evil) social media platform, so sharing here also πŸ™‚

My 10 tips:

18.11.2024 12:31 β€” πŸ‘ 84    πŸ” 16    πŸ’¬ 14    πŸ“Œ 2

Thanks a lot 😊 I am also happy about what we achieved!

18.11.2024 16:19 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Overall, we propose a new fingerprinting algorithm for LLMs based on prompt suffixes optimized to make the model output an answer chosen at random.
πŸŽ‰A big shoutout to my amazing co-authors from @parameterlab.bsky.social & Naver AI Lab: @dnnslmr.bsky.social, Hwaran Lee, @oodgnas.bsky.social @coallaoh.bsky.social!

18.11.2024 15:46 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
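
To make the black-box identification step concrete, here is a minimal sketch of a decision rule one could apply at query time (my own toy version with an assumed threshold; the optimized suffix and target number come from the offline optimization described further down this thread):

```python
# Toy identification rule; `query_model` is a placeholder for the deployed,
# black-box LLM (prompt -> text). The 0.5 threshold is my own choice.
def looks_like_reference(query_model, prompt, suffix, target, n_trials=20):
    hits = sum(target in query_model(prompt + suffix) for _ in range(n_trials))
    # The reference LLM emits the target ~95-100% of the time, while other
    # models do so <1% of the time on average (per the results in this thread).
    return hits / n_trials >= 0.5
```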

πŸ›‘οΈNevertheless, the third party can deploy the reference LLM with changes, so we explore the robustness of our identification:
- TRAP is robust to generation hyperparameters (usual ranges)
- TRAP is not robust to some system prompts

18.11.2024 15:46 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

πŸ€” You may wonder: what if two models are trained on the same data? It turns out that none of our suffix optimized on Llama2-7B-chat transfer to its 13B sibling (also true for VicuΓ±a and Guanaco). So our suffixes are specific to the specific weights of the ref LLM.

18.11.2024 15:46 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

TRAP beats the perplexity baseline while using fewer output tokens (3-18 vs. 150). Moreover, perplexity-based identification is sensitive to the type of prompt.

18.11.2024 15:46 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
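
For reference, a minimal version of the perplexity-baseline idea (my own sketch; the model name is an assumption): score the deployment's answer under the reference model, since text the reference model generated itself tends to have unusually low perplexity, but this needs long outputs and is prompt-sensitive.

```python
# Minimal perplexity scoring under a reference model (sketch; any causal LM
# from the Hugging Face Hub would work the same way).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(text, model_name="meta-llama/Llama-2-7b-chat-hf"):
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss   # mean token negative log-likelihood
    return torch.exp(loss).item()
```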

It turns out that this suffix is specific to the reference model. So we can use it as a fingerprint.
- The suffix forces the ref LLM to output the target number 95-100% of the time
- The suffix is specific to the ref LLM (<1% average transfer rate to another LLM)

18.11.2024 15:46 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
GCG use for TRAP

In practice, we ask the LLM for a random number and try to force its answer using a suffix prompt. We first sample a random target number, then tune the suffix so that the reference LLM outputs this specific number. We repurpose GCG, which was originally designed for jailbreaking.

18.11.2024 15:46 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
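
A heavily simplified sketch of that suffix-tuning loop (a crude random-search stand-in for GCG's gradient-guided token swaps, just to show the objective; not the TRAP implementation):

```python
# Random-search stand-in for GCG: mutate one suffix token at a time and keep
# swaps that make the target number more likely under the reference model.
# `model` / `tok` are a Hugging Face causal LM and its tokenizer (the reference LLM).
import random
import torch

def target_loss(model, tok, prompt, suffix_ids, target):
    """Cross-entropy of the target string given prompt + current suffix."""
    prefix = tok(prompt, return_tensors="pt").input_ids
    tgt = tok(target, add_special_tokens=False, return_tensors="pt").input_ids
    ids = torch.cat([prefix, suffix_ids, tgt], dim=1)
    labels = ids.clone()
    labels[:, : ids.shape[1] - tgt.shape[1]] = -100    # score only the target
    with torch.no_grad():
        return model(ids, labels=labels).loss.item()

def tune_suffix(model, tok, prompt, target, suffix_len=10, steps=500):
    suffix = torch.randint(0, tok.vocab_size, (1, suffix_len))
    best = target_loss(model, tok, prompt, suffix, target)
    for _ in range(steps):
        cand = suffix.clone()
        cand[0, random.randrange(suffix_len)] = random.randrange(tok.vocab_size)
        loss = target_loss(model, tok, prompt, cand, target)
        if loss < best:
            suffix, best = cand, loss
    return tok.decode(suffix[0])
```

In the actual GCG procedure, the candidate token substitutions at each position come from gradients of this same loss with respect to the token embeddings, which is what makes the search efficient.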
