YouTube video by Google TechTalks
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
I gave a talk at the Google Privacy in ML Seminar last summer on privacy & memorization: "Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training".
It's up on YouTube now if you're interested :)
youtu.be/IzIsHFCqXGo?...
18.02.2026 02:05
re architecture/training dynamics, this is one of my fav plots showing how architecture-specific inherent biases influence which examples get memorized
bsky.app/profile/jayd...
30.01.2026 22:45
Some of these results really changed how I think about memorization. It's not simply the data. Many factors (data properties, architecture, training objectives, capacity, etc.) interplay to determine what exactly gets memorized.
30.01.2026 22:42
Microsoft Research NYC is hiring a researcher in the space of AI and society!
29.01.2026 23:27
Paper: arxiv.org/pdf/2601.15394
Joint work with my incredible co-authors: Karan Chadha, Niloofar Mireshghallah, Yuchen Zhang, Irina-Elena Veliche, Archi Mitra, @dasmiq.bsky.social, Zheng Xu, Diego Garcia-Olano!
23.01.2026 20:51
Finally, we compare soft (logit-level) vs. hard (sequence-level) distillation. Hard KD is often used when teacher logits are inaccessible. We find that while both show similar rates, hard KD is riskier, inheriting 2.7x more memorization from the teacher than soft KD.
23.01.2026 20:51
We compute sequence log-prob & avg Shannon entropy. We find cross-entropy pushes the model to overfit on examples it is uncertain about, resulting in forced memorization. In contrast, KD permits it to output a flatter, more uncertain distribution rather than forcing memorization.
23.01.2026 20:51
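A toy sketch of the two diagnostics in this step (my own illustrative code, not the paper's; the distributions below are invented):

```python
import math

def sequence_log_prob(token_probs):
    """Sequence log-probability: sum of log p(token | prefix) over the sequence."""
    return sum(math.log(p) for p in token_probs)

def avg_shannon_entropy(step_dists):
    """Mean Shannon entropy (nats) of the per-step next-token distributions;
    higher means flatter, more uncertain predictions."""
    ents = [-sum(p * math.log(p) for p in dist if p > 0) for dist in step_dists]
    return sum(ents) / len(ents)

# Toy 4-token vocabulary. A memorizing model is near one-hot at every step
# (high sequence log-prob, low entropy) ...
sharp = [[0.97, 0.01, 0.01, 0.01]] * 5
# ... while a KD-trained student may keep a flatter, more uncertain distribution.
flat = [[0.4, 0.3, 0.2, 0.1]] * 5
assert avg_shannon_entropy(sharp) < avg_shannon_entropy(flat)
assert sequence_log_prob([0.97] * 5) > sequence_log_prob([0.4] * 5)
```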
Why does distillation reduce memorization vs. fine-tuning w/ cross-entropy? We hypothesize that this could be due to the difference between the hard targets (one-hot labels) of cross-entropy and the soft targets (full probability distribution) of KL Divergence.
23.01.2026 20:51
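To make the hard-vs-soft-target intuition concrete, here is a toy sketch (my own illustration; the distributions are invented): cross-entropy against a one-hot label stays large until the student concentrates mass on the gold token, while KL against a soft teacher target can already be zero for an uncertain student.

```python
import math

def cross_entropy(p_model, target_idx):
    """CE against a one-hot target: only the gold token's probability matters,
    so the student is pushed toward probability 1 on that token."""
    return -math.log(p_model[target_idx])

def kl_divergence(p_teacher, p_student):
    """KL(teacher || student): the student is rewarded for matching the
    teacher's full (possibly flat, uncertain) distribution."""
    return sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)

# Toy next-token distributions over a 4-token vocabulary.
student = [0.4, 0.3, 0.2, 0.1]
teacher = [0.4, 0.3, 0.2, 0.1]   # the teacher is uncertain here too

# The hard target says "token 0 with probability 1": CE stays large until the
# student overfits. KL is already zero because the teacher's soft target
# matches the student's uncertain distribution.
assert cross_entropy(student, 0) > 0.9          # -ln(0.4) ~ 0.92
assert abs(kl_divergence(teacher, student)) < 1e-9
```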
We can predict which examples the student will memorize before distillation! By training a log-reg classifier on features like zlib, KLD loss, and PPL, we pre-identify these risks. Removing them from the training set results in a significant reduction in memorization (0.07% -> 0.0004%).
23.01.2026 20:51
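A minimal sketch of that pre-identification idea, assuming synthetic stand-in features (the real pipeline computes signals like zlib compressibility, KLD loss, and perplexity from the models; everything below is invented toy data):

```python
import math
import random

# Hypothetical stand-ins for the paper's features (zlib compressibility,
# KLD loss, perplexity); values here are synthetic, not real model signals.
random.seed(0)
data = []
for _ in range(400):
    x = [random.gauss(0, 1) for _ in range(3)]
    # Synthetic labeling rule: compressible, low-loss examples are "memorized".
    label = 1.0 if (-1.5 * x[0] - 1.0 * x[1] + 0.3 * x[2]) > 0 else 0.0
    data.append((x, label))

def sigmoid(z):
    # Clamp very negative logits to avoid math.exp overflow.
    return 1.0 / (1.0 + math.exp(-z)) if z > -60 else 0.0

# Plain logistic regression trained with batch gradient descent.
w, b, lr = [0.0, 0.0, 0.0], 0.0, 0.5
for _ in range(500):
    gw, gb = [0.0, 0.0, 0.0], 0.0
    for x, y in data:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
        for i in range(3):
            gw[i] += err * x[i]
        gb += err
    for i in range(3):
        w[i] -= lr * gw[i] / len(data)
    b -= lr * gb / len(data)

def risk(x):
    """Predicted probability that an example will be memorized."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Examples scoring above a threshold would be dropped before distillation.
accuracy = sum((risk(x) > 0.5) == (y == 1.0) for x, y in data) / len(data)
```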
Are these "easy" examples universal across architectures (Pythia, OLMo-2, Qwen-3)? We observe that while all models prefer memorizing low-entropy data, they don't **agree** on which examples to memorize. We analyzed cross-model perplexity to decode this selection mechanism.
23.01.2026 20:51
This leads us to ask: why are certain examples easier to memorize? Since our data has no duplicates, duplication isn't the cause. In line with prior work, we compute compressibility (zlib entropy) and perplexity, & find that they are highly correlated with these "easy" examples.
23.01.2026 20:51
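The zlib-compressibility signal is easy to sketch (the function name and example strings here are mine, not from the paper):

```python
import zlib

def zlib_ratio(text: str) -> float:
    """Compressed-to-raw byte ratio: lower means more compressible
    (more repetitive / lower-entropy) text."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)

# Repetitive, templated text compresses far better than varied prose,
# which is the kind of low-entropy signal that correlates with "easy"
# (preferentially memorized) examples.
boilerplate = "name: John Doe, id: 0001; " * 20
varied = "The quick brown fox jumps over the lazy dog near the riverbank."
assert zlib_ratio(boilerplate) < zlib_ratio(varied)
```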
Next, we find that certain examples are consistently memorized across model sizes within a family because they are **inherently easier to memorize**. We find that distilled models preferentially memorize these easy examples (accounting for over 80% of their total memorization).
23.01.2026 20:51
Excited to share my work at Meta.
Knowledge Distillation has been gaining traction for LLM utility. We find that distilled models don't just improve performance, they also memorize significantly less training data than standard fine-tuning (reducing memorization by >50%). 🧵
23.01.2026 20:51
very cool work!!!
07.01.2026 23:07
Syntax that spuriously correlates with safe domains can jailbreak LLMs - e.g. below with GPT-4o mini
Our paper (co w/ Vinith Suriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨ spotlight!
+ @marzyehghassemi.bsky.social, @byron.bsky.social, Levent Sagun
24.10.2025 16:23
CS PhD Statements of Purpose
cs-sop.org is a platform intended to help CS PhD applicants. It hosts a database of example statements of purpose (SoP) shared by previous applicants to Computer Science PhD programs.
It is PhD application season again! For those looking to do a PhD in AI, these are some useful resources:
1. Examples of statements of purpose (SOPs) for computer science PhD programs: cs-sop.org [1/4]
01.10.2025 20:37
"AI slop" seems to be everywhere, but what exactly makes text feel like "slop"?
In our new work (w/ @tuhinchakr.bsky.social, Diego Garcia-Olano, @byron.bsky.social ) we provide a systematic attempt at measuring AI "slop" in text!
arxiv.org/abs/2509.19163
🧵 (1/7)
24.09.2025 13:21
TALKIN' 'BOUT AI GENERATION: COPYRIGHT AND THE GENERATIVE-AI SUPPLY CHAIN | The Copyright Society
After 2 years in press, it's published!
"Talkin' 'Bout AI Generation: Copyright and the Generative-AI Supply Chain," is out in the 72nd volume of the Journal of the Copyright Society
copyrightsociety.org/journal-entr...
written with @katherinelee.bsky.social & @jtlg.bsky.social (2023)
10.09.2025 19:08
it was soo fun!
30.07.2025 04:03
Excited to be attending ACL in Vienna next week! I'll be co-presenting a poster with Niloofar Mireshghallah on our recent PII memorization work on July 29, 16:00-17:30, Session 10, Hall 4/5 (& at the LLM memorization workshop)!
If you would like to chat memorization/privacy/safety, please reach out :)
22.07.2025 04:38
Big congratulations!!
22.07.2025 04:35
congrats!!
16.05.2025 17:25
big thanks to my wonderful co-authors Matthew Jagielski @katherinelee.bsky.social Niloofar Mireshghallah @dasmiq.bsky.social Christopher A. Choquette-Choo!!
15.05.2025 18:01
Privacy Ripple Effects has been accepted to the Findings of ACL 2025!
See you in Vienna! #ACL2025
15.05.2025 17:24
Very excited to be joining Meta GenAI as a Visiting Researcher starting this June in New York City! I'll be continuing my work on studying memorization and safety in language models.
If you're in NYC and would like to hang out, please message me :)
15.05.2025 03:18
15.05.2025 02:20
I am at CHI this week to present my poster (Framing Health Information: The Impact of Search Methods and Source Types on User Trust and Satisfaction in the Age of LLMs) on Wednesday April 30
CHI Program Link: programs.sigchi.org/chi/2025/pro...
Looking forward to connecting with you all!
29.04.2025 00:50
4/26 at 3pm:
'Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon'
USVSN Sai Prashanth Β· @nsaphra.bsky.social et al
Submission: openreview.net/forum?id=3E8...
25.04.2025 17:28
Bummed to be missing ICLR, but if you're interested in all things memorization, stop by poster #200 Hall 3 + Hall 2B on April 26 3-5:30 pm and chat with several of my awesome co-authors.
We propose a taxonomy for different types of memorization in LMs. Paper: openreview.net/pdf?id=3E8YN...
21.04.2025 19:22
he/him · textual technologies enthusiast · Associate Professor at UIUC's School of Information Sciences & Department of English · co-director of the Viral Texts Project (viraltexts.org) & Director @skeuomorphpress.org · also https://theanxiousbench.bandcamp.com
Postdoc at Mila & McGill University 🇨🇦 with a PhD in NLP from the University of Edinburgh 🏴󠁧󠁢󠁳󠁣󠁴󠁿 memorization vs generalization x (non-)compositionality. she/her 👩‍💻 🇳🇱
Stanford Professor of Linguistics and, by courtesy, of Computer Science, and member of @stanfordnlp.bsky.social and The Stanford AI Lab. He/Him/His. https://web.stanford.edu/~cgpotts/
A LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef
Writes http://interconnects.ai
At Ai2 via HuggingFace, Berkeley, and normal places
language model pretraining @ai2.bsky.social, co-lead of data research w/ @soldaini.net, statistics @uw, open science, tabletop, seattle, he/him, kyleclo.com
AI safety at Anthropic, on leave from a faculty job at NYU.
Views not employers'.
I think you should join Giving What We Can.
cims.nyu.edu/~sbowman
Assistant Professor at Bar-Ilan University
https://yanaiela.github.io/
asst prof of computer science at cu boulder
nlp, cultural analytics, narratives, communities
books, bikes, games, art
https://maria-antoniak.github.io
NLP, Linguistics, Cognitive Science, AI, ML, etc.
Job currently: Research Scientist (NYC)
Job formerly: NYU Linguistics, MSU Linguistics
Postdoc at UW NLP. #NLProc, computational social science, cultural analytics, responsible AI. she/her. Previously at Berkeley, Ai2, MSR, Stanford. Incoming assistant prof at Wisconsin CS. lucy3.github.io
Tell me about challenges, the unbelievable, the human mind and artificial intelligence, thoughts, social life, family life, science and philosophy.
PhD Student at Northeastern, working to make LLMs interpretable
Postdoc @ Khoury | Previously Ph.D. @ UVA (David Evans) | IIITD Alum | Interested in machine learning privacy & security.
ELLIS PhD Fellow @belongielab.org | @aicentre.dk | University of Copenhagen | @amsterdamnlp.bsky.social | @ellis.eu
Multi-modal ML | Alignment | Culture | Evaluations & Safety| AI & Society
Web: https://www.srishti.dev/
Interdisciplinary researcher interested in modality, discourse, and dialogue.
Assistant Professor of CS
@Khoury College, Northeastern University
https://www.malihealikhani.com/
AI policy researcher, wife guy in training, fan of cute animals and sci-fi. Started a Substack recently: https://milesbrundage.substack.com/
Assoc. Prof in CS @ Northeastern, NLP/ML & health & etc. He/him.
CS PhD students at Northeastern University
Computational Social Science / NLP
Computer science, math, machine learning, (differential) privacy
Researcher at Google DeepMind
Kiwi 🇳🇿 in California 🇺🇸
http://stein.ke/