(9/n) Finally, I would like to thank all my amazing co-authors: Avinava, @abeirami.bsky.social, Rahul, Nicholas, Amr, Snigdha.
cc @unccs.bsky.social
@somnathbrc.bsky.social
Research Scientist at Google Research https://www.cs.unc.edu/~somnath/
(8/n) Here is a blog post with a simplified overview of our work: www.cs.unc.edu/~somnath/blo...
Code: github.com/brcsomnath/pef
Paper link: arxiv.org/abs/2503.20098
(7/n) We would like to highlight great prior work, like LEACE, which perfectly erases concepts to protect against linear adversaries. In our work, we improve upon this line of work and present a technique that can protect against any adversary.
x.com/norabelrose/...
(6/n) We also visualize the learned representations from different erasure methods. We observe that PEF perfectly erases group (or concept) information without losing other information (i.e., without collapsing the representation space).
02.04.2025 16:03 · (5/n) Empirically, we observe that PEF reaches the theoretical limits of erasure even in challenging settings where other methods struggle, including both linear (INLP, LEACE) and non-linear techniques (FaRM, KRaM).
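A common way to measure erasure empirically is to train a probing adversary on the erased representations and check whether its accuracy drops to chance. A minimal, self-contained sketch with synthetic data, a least-squares linear probe standing in for the adversary, and a crude zero-out "erasure" (illustrative only, not PEF):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "representations": the concept label leaks into dimension 0.
labels = rng.integers(0, 2, size=1000).astype(float)
reps = rng.normal(size=(1000, 16))
reps[:, 0] += 3.0 * labels

# Crude "erasure" for illustration only: zero out the leaking direction.
erased = reps.copy()
erased[:, 0] = 0.0

def linear_probe_accuracy(X, y, split=800):
    # Least-squares linear probe (with bias) as a simple linear adversary.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb[:split], y[:split], rcond=None)
    preds = (Xb[split:] @ w > 0.5).astype(float)
    return (preds == y[split:]).mean()

print(linear_probe_accuracy(reps, labels))    # high: concept is recoverable
print(linear_probe_accuracy(erased, labels))  # near chance (~0.5)
```

If the probe cannot beat chance on the erased representations, the concept is hidden from this (linear) adversary; stronger erasure guarantees require testing against non-linear adversaries as well.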
(4/n) When the distributions are unequal, we still achieve perfect erasure, but with slightly reduced utility. The erasure function for this setting is shown below.
(3/n) From the above limits, we show that optimal perfect concept erasure is only feasible when the underlying distributions are equal up to permutation. In such scenarios, the erasure function is shown in the diagram.
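The "equal up to permutation" condition is easy to check for discrete distributions: two distributions match up to a relabeling of their support exactly when their sorted probability vectors coincide. A small sketch (the distributions and helper name are illustrative, not from the paper):

```python
import numpy as np

def equal_up_to_permutation(p, q, tol=1e-9):
    # Distributions are equal up to a permutation of their support
    # iff their probability masses agree after sorting.
    return bool(np.allclose(np.sort(p), np.sort(q), atol=tol))

p = np.array([0.5, 0.3, 0.2])   # e.g., P(X | group A)
q = np.array([0.2, 0.5, 0.3])   # same masses, reordered support
r = np.array([0.6, 0.3, 0.1])   # genuinely different masses

print(equal_up_to_permutation(p, q))  # True
print(equal_up_to_permutation(p, r))  # False
```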
(2/n) We study the fundamental limits of concept erasure. Building on the work of @FlavioCalmon et al. in the information theory literature, we characterize the erasure capacity and the maximum utility that can be retained during concept erasure.
How can we perfectly erase concepts from LLMs?
Our method, Perfect Erasure Functions (PEF), erases concepts perfectly from LLM representations. We analytically derive PEFs without parameter estimation. PEFs achieve a Pareto-optimal erasure-utility tradeoff backed by theoretical guarantees. #AISTATS2025 🧵
Please stop by our posters if you're interested. Feel free to reach out if you're interested in AI safety or efficiency, or just want to chat!
CC: @unccs.bsky.social
(3/3) Towards Scalable Exact Machine Unlearning Using PEFT
I'm also presenting my ongoing unlearning work at the SafeGenAI Workshop. It uses a novel PEFT training approach to improve exact unlearning efficiency.
arxiv.org/abs/2406.16257
(2/3) Fast Tree-Field Integrator
An efficient method for graph field integration (a special case of matrix-vector multiplication) using integrator trees. FTFI enables polylog-linear time multiplication with performance boosts in vision transformers.
arxiv.org/abs/2406.15881
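For intuition, graph field integration computes y_i = sum_j f(d(i, j)) * x_j over a graph metric; the naive baseline below is O(n^2), which is what FTFI's integrator trees accelerate to polylog-linear time on trees. A sketch using hop distances and an exponential-decay kernel (both illustrative choices, not the paper's exact setup):

```python
import numpy as np
from collections import deque

def tree_distances(n, edges):
    # All-pairs hop distances on a tree via BFS from each node: O(n^2).
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    D = np.zeros((n, n))
    for s in range(n):
        dist = {s: 0}
        dq = deque([s])
        while dq:
            u = dq.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    dq.append(v)
        for v, d in dist.items():
            D[s, v] = d
    return D

# Naive field integration: y = f(D) @ x, the O(n^2) baseline.
edges = [(0, 1), (1, 2), (1, 3)]      # a small 4-node tree
D = tree_distances(4, edges)
x = np.array([1.0, 2.0, 3.0, 4.0])    # field values at the nodes
f = lambda d: np.exp(-d)              # an arbitrary decay kernel
y = f(D) @ x
```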
🚨 I'm traveling to #NeurIPS2024 next week to present these papers.
(1/3) Structured Unrestricted-Rank Matrices for PEFT
A new PEFT method replacing low-rank matrices (LoRA) with more expressive structured matrices.
arxiv.org/abs/2406.17740
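To see the contrast with LoRA: a rank-r update B @ A costs 2dr parameters and is rank-limited, while a structured matrix such as a circulant (one illustrative structured family, not necessarily the one used in the paper) is generically full-rank with only d parameters and applies in O(d log d) via the FFT:

```python
import numpy as np

d, r = 256, 8
rng = np.random.default_rng(0)
x = rng.normal(size=d)

# LoRA-style low-rank update: 2*d*r trainable parameters, rank <= r.
A = rng.normal(size=(r, d))
B = rng.normal(size=(d, r))
lora_out = B @ (A @ x)

# Circulant alternative: d parameters; matvec is a circular convolution,
# computed in O(d log d) with the FFT.
c = rng.normal(size=d)  # first column defines the circulant matrix
circ_out = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

print("LoRA params:", 2 * d * r, "| circulant params:", d)
```

The FFT trick works because multiplying by a circulant matrix with first column c is exactly circular convolution with c, which the FFT diagonalizes.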