Katherine Lee @katherinelee

Extracting memorized pieces of (copyrighted) books from open-weight language models
Plaintiffs and defendants in copyright lawsuits over generative AI often make sweeping, opposing claims about the extent to which large language models (LLMs) have memorized plaintiffs' protected expr...

Llama 3.1 70B contains copies of nearly the entirety of some books. Harry Potter is just one of them. I don’t know if this means it’s an infringing copy. But the first question to answer is whether it’s a copy in the first place. That’s what our new results suggest:

arxiv.org/abs/2505.12546

21.05.2025 11:20 — 👍 53 🔁 24 💬 4 📌 4

Come chat about unlearning with us!!

02.04.2025 16:57 — 👍 5 🔁 1 💬 0 📌 0

Small robot smoking and waving with their right hand

We’ve been receiving a bunch of questions about a CFP for GenLaw 2025.

We wanted to let you know that we chose not to submit a workshop proposal this year (we need a break!!). We’ll be at ICML though and look forward to catching up there!

You can watch our prior videos!

09.03.2025 20:33 — 👍 5 🔁 2 💬 2 📌 0

Career Update: Google DeepMind -> Anthropic TODO

Nicholas is leaving GDM at the end of this week, and we're feeling big sad about it: nicholas.carlini.com/writing/2025...

05.03.2025 21:56 — 👍 5 🔁 2 💬 0 📌 0

📢 The First Workshop on Large Language Model Memorization (L2M2) will be co-located with @aclmeeting.bsky.social in Vienna 🎉

💡 L2M2 brings together researchers to explore memorization from multiple angles. Whether it's text-only LLMs or Vision-language models, we want to hear from you! 🌍

27.01.2025 21:50 — 👍 11 🔁 3 💬 1 📌 3

4th ACM Symposium on Computer Science & Law (CS&Law 2025). The ACM Symposium on Computer ...

Registration for CSLaw 2025 is now open! Please share far and wide!

Early bird prices are available until February 24. The main conference will begin March 25!

Register here: web.cvent.com/event/dbf97d...

02.02.2025 13:27 — 👍 3 🔁 3 💬 1 📌 0

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
We articulate fundamental mismatches between technical methods for machine unlearning in Generative AI, and documented aspirations for broader impact that these methods could have for law and policy. ...

New paper on why machine "unlearning" is much harder than it seems is now up on arXiv: arxiv.org/abs/2412.06966 This was a huuuuuge cross-disciplinary effort led by @msftresearch.bsky.social FATE postdoc @grumpy-frog.bsky.social!!!

14.12.2024 00:55 — 👍 73 🔁 24 💬 2 📌 0

My paper with @jtlg.bsky.social, Daniel Ho, A. Feder Cooper, and a host of computer science folks on the limits of AI "unlearning" of data and content is now posted on arXiv:

arxiv.org/abs/2412.06966

11.12.2024 19:46 — 👍 10 🔁 5 💬 0 📌 1
