
Jaydeep Borkar

@jaydeepborkar.bsky.social

PhD Candidate at Northeastern / Incoming Research Intern + ex-Visiting Researcher at Meta (MSL) / Organizer at the Trustworthy ML Initiative (trustworthyml.org). safety & privacy in language models + mountain biking. jaydeepborkar.github.io

40 Followers  |  40 Following  |  46 Posts  |  Joined: 27.12.2024

Latest posts by jaydeepborkar.bsky.social on Bluesky


YouTube video by Google TechTalks: Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training

I gave a talk at the Google Privacy in ML Seminar last summer on privacy & memorization: "Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training".

It's up on YouTube now if you're interested :)
youtu.be/IzIsHFCqXGo?...

18.02.2026 02:05 β€” πŸ‘ 2    πŸ” 2    πŸ’¬ 0    πŸ“Œ 0

re architecture/training dynamics, this is one of my fav plots showing how architecture-specific inherent biases influence which examples get memorized
bsky.app/profile/jayd...

30.01.2026 22:45 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Some of these results really changed how I think about memorization. It’s not simply the data: many factors (data properties, architecture, training objectives, capacity, etc.) interact to determine what exactly gets memorized.

30.01.2026 22:42 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Microsoft Research NYC is hiring a researcher in the space of AI and society!

29.01.2026 23:27 β€” πŸ‘ 62    πŸ” 40    πŸ’¬ 2    πŸ“Œ 2

Paper: arxiv.org/pdf/2601.15394

Joint work with my incredible co-authors: Karan Chadha, Niloofar Mireshghallah, Yuchen Zhang, Irina-Elena Veliche, Archi Mitra, @dasmiq.bsky.social, Zheng Xu, Diego Garcia-Olano!

23.01.2026 20:51 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image

Finally, we compare soft (logit-level) vs. hard (sequence-level) distillation. Hard KD is often used when teacher logits are inaccessible. We find that while both show similar overall memorization rates, hard KD is riskier, inheriting 2.7x more memorization from the teacher than soft KD.

23.01.2026 20:51 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

We compute sequence log-prob & avg Shannon entropy. We find cross-entropy pushes the model to overfit on examples it is uncertain about, resulting in forced memorization. In contrast, KD permits it to output a flatter, more uncertain distribution rather than forcing memorization.

23.01.2026 20:51 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
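The two quantities in the post can be sketched in a few lines of pure Python on toy distributions (the helper name `sequence_stats` is mine, not the paper's): a model that is peaked on the targets gets high sequence log-prob and low entropy, while a flat model gets low log-prob and maximal entropy.

```python
import math

def sequence_stats(per_token_dists, target_ids):
    """Return (sequence log-prob of the targets, average Shannon entropy
    in nats) for a list of per-token probability distributions."""
    logprob = sum(math.log(dist[t]) for dist, t in zip(per_token_dists, target_ids))
    avg_entropy = sum(
        -sum(p * math.log(p) for p in dist if p > 0) for dist in per_token_dists
    ) / len(per_token_dists)
    return logprob, avg_entropy

# Peaked model: near-certain on the targets -> high log-prob, low entropy.
peaked = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]
# Flat model: uniform over 4 tokens -> low log-prob, maximal entropy (ln 4).
flat = [[0.25] * 4, [0.25] * 4]

lp_peaked, h_peaked = sequence_stats(peaked, [0, 1])
lp_flat, h_flat = sequence_stats(flat, [0, 1])
assert lp_peaked > lp_flat and h_peaked < h_flat
```

The post's observation in these terms: cross-entropy fine-tuning drives examples toward the "peaked" regime even when the model started out uncertain, while KD tolerates the "flat" regime.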

Why does distillation reduce memorization vs. fine-tuning w/ cross-entropy? We hypothesize that this could be due to the difference between the hard targets (one-hot labels) of cross-entropy and the soft targets (full probability distributions) of KL divergence.

23.01.2026 20:51 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
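A toy illustration of that asymmetry (illustrative only; these helpers are hypothetical, not the paper's implementation): with a one-hot target, the loss reaches zero only if the model puts all its mass on that exact token, whereas KL against a flat teacher is already zero for an equally flat student — no pressure to memorize.

```python
import math

def cross_entropy(p_model, onehot_idx):
    """Hard target: all mass on one token; loss is zero only if the model
    assigns probability 1 to that exact token."""
    return -math.log(p_model[onehot_idx])

def kl_divergence(p_teacher, p_model):
    """Soft target: match the teacher's full distribution; a flat teacher
    lets a flat student reach zero loss without memorizing anything."""
    return sum(q * math.log(q / p) for q, p in zip(p_teacher, p_model) if q > 0)

student_flat = [0.25, 0.25, 0.25, 0.25]
teacher_flat = [0.25, 0.25, 0.25, 0.25]

assert kl_divergence(teacher_flat, student_flat) == 0.0  # no memorization pressure
assert cross_entropy(student_flat, 0) > 1.0              # one-hot CE still penalizes
```

This is exactly the hypothesized mechanism: one-hot cross-entropy penalizes uncertainty even on hard-to-learn examples, while KL lets the student stay uncertain wherever the teacher is.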

We can predict which examples the student will memorize before distillation! By training a log-reg classifier on features like zlib entropy, KLD loss, and PPL, we pre-identify these risks. Removing them from the training data results in a significant reduction in memorization (0.07% -> 0.0004%).

23.01.2026 20:51 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

Are these “easy” examples universal across architectures (Pythia, OLMo-2, Qwen-3)? We observe that while all models prefer memorizing low-entropy data, they don’t **agree** on which examples to memorize. We analyzed cross-model perplexity to decode this selection mechanism.

23.01.2026 20:51 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 1
Post image

This leads us to study why certain examples are easier to memorize. Since our data has no duplicates, duplication isn't the cause. In line with prior work, we compute compressibility (zlib entropy) and perplexity, & find that both are highly correlated with these "easy" examples.

23.01.2026 20:51 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
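The zlib-compressibility signal mentioned above is easy to compute with the standard library (a minimal sketch; the paper's exact feature definition may differ): repetitive, low-entropy text compresses to a much smaller fraction of its raw size.

```python
import zlib

def zlib_ratio(text: str) -> float:
    """Compressed-size / raw-size ratio. Lower = more compressible
    (more repetitive / lower entropy), which correlates with the
    "easy"-to-memorize examples."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)

repetitive = "the cat sat on the mat. " * 20
varied = "Quartz jelly vibes mix pond fowl; dizzy quack jabs."
assert zlib_ratio(repetitive) < zlib_ratio(varied)
```

Perplexity plays the complementary role: a low-PPL sequence is one the model already finds predictable, so little optimization pressure is needed to pin it down verbatim.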
Post image

Next, we find that certain examples are consistently memorized across model sizes within a family because they are **inherently easier to memorize**. We find that distilled models preferentially memorize these easy examples (accounting for over 80% of their total memorization).

23.01.2026 20:51 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image Post image Post image

We find the student recovers 78% of the teacher’s generalization over the baseline (std. fine-tuning) while inheriting only 2% of its memorization. This shows the student learns the teacher’s general capabilities but rejects the majority of the examples the teacher exclusively memorized.

23.01.2026 20:51 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

Excited to share my work at Meta.

Knowledge distillation has been gaining traction for improving LLM utility. We find that distilled models don't just improve performance; they also memorize significantly less training data than standard fine-tuning (reducing memorization by >50%). 🧡

23.01.2026 20:51 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 1

very cool work!!!

07.01.2026 23:07 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image Post image

Syntax that spuriously correlates with safe domains can jailbreak LLMs - e.g. below with GPT-4o mini

Our paper (co w/ Vinith Suriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight!

+ @marzyehghassemi.bsky.social, @byron.bsky.social, Levent Sagun

24.10.2025 16:23 β€” πŸ‘ 6    πŸ” 4    πŸ’¬ 3    πŸ“Œ 1
CS PhD Statements of Purpose cs-sop.org is a platform intended to help CS PhD applicants. It hosts a database of example statements of purpose (SoP) shared by previous applicants to Computer Science PhD programs.

It is PhD application season again πŸ‚ For those looking to do a PhD in AI, these are some useful resources πŸ€–:

1. Examples of statements of purpose (SOPs) for computer science PhD programs: cs-sop.org [1/4]

01.10.2025 20:37 β€” πŸ‘ 9    πŸ” 4    πŸ’¬ 1    πŸ“Œ 0
Post image Post image

"AI slop" seems to be everywhere, but what exactly makes text feel like "slop"?

In our new work (w/ @tuhinchakr.bsky.social, Diego Garcia-Olano, @byron.bsky.social ) we provide a systematic attempt at measuring AI "slop" in text!

arxiv.org/abs/2509.19163

🧡 (1/7)

24.09.2025 13:21 β€” πŸ‘ 31    πŸ” 16    πŸ’¬ 1    πŸ“Œ 1
TALKIN' 'BOUT AI GENERATION: COPYRIGHT AND THE GENERATIVE-AI SUPPLY CHAIN | The Copyright Society

After 2 years in press, it's published!

"Talkin' 'Bout AI Generation: Copyright and the Generative-AI Supply Chain" is out in the 72nd volume of the Journal of the Copyright Society

copyrightsociety.org/journal-entr...

written with @katherinelee.bsky.social & @jtlg.bsky.social (2023)

10.09.2025 19:08 β€” πŸ‘ 14    πŸ” 4    πŸ’¬ 1    πŸ“Œ 0

it was soo fun!

30.07.2025 04:03 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Excited to be attending ACL in Vienna next week! I’ll be co-presenting a poster with Niloofar Mireshghallah on our recent PII memorization work on July 29 16:00-17:30 Session 10 Hall 4/5 (& at LLM memorization workshop)!

If you would like to chat memorization/privacy/safety, please reach out :)

22.07.2025 04:38 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Big congratulations!! 🎊

22.07.2025 04:35 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

congrats!! 🎊

16.05.2025 17:25 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

big thanks to my wonderful co-authors Matthew Jagielski @katherinelee.bsky.social Niloofar Mireshghallah @dasmiq.bsky.social Christopher A. Choquette-Choo!!

15.05.2025 18:01 β€” πŸ‘ 2    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

Privacy Ripple Effects has been accepted to the Findings of ACL 2025! πŸŽ‰

See you in Vienna! #ACL2025

15.05.2025 17:24 β€” πŸ‘ 13    πŸ” 3    πŸ’¬ 1    πŸ“Œ 0
Post image

Very excited to be joining Meta GenAI as a Visiting Researcher starting this June in New York City!πŸ—½ I’ll be continuing my work on studying memorization and safety in language models.

If you’re in NYC and would like to hang out, please message me :)

15.05.2025 03:18 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 0    πŸ“Œ 1

πŸ˜‚πŸ˜‚

15.05.2025 02:20 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image

I am at CHI this week to present my poster (Framing Health Information: The Impact of Search Methods and Source Types on User Trust and Satisfaction in the Age of LLMs) on Wednesday April 30

CHI Program Link: programs.sigchi.org/chi/2025/pro...

Looking forward to connecting with you all!

29.04.2025 00:50 β€” πŸ‘ 1    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0
Post image

4/26 at 3pm:

'Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon'
USVSN Sai Prashanth Β· @nsaphra.bsky.social et al

Submission: openreview.net/forum?id=3E8...

25.04.2025 17:28 β€” πŸ‘ 5    πŸ” 2    πŸ’¬ 1    πŸ“Œ 0

Bummed to be missing ICLR, but if you’re interested in all things memorization, stop by poster #200 Hall 3 + Hall 2B on April 26 3-5:30 pm and chat with several of my awesome co-authors.

We propose a taxonomy for different types of memorization in LMs. Paper: openreview.net/pdf?id=3E8YN...

21.04.2025 19:22 β€” πŸ‘ 3    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0
