Wise words from a fellow scholar…
@poonpura.bsky.social
M.S. Computer Science @Stanford. Interested in machine learning privacy, AI security, diffusion models, cryptography, AI for environment, healthcare, education 🌱 poonpura.github.io
I am also applying for PhD programs this Fall! If you think I am a good fit for your lab, please contact me at pura@stanford.edu
For details, check out our paper (feedback appreciated!):
📄: arxiv.org/abs/2411.14639
🙏: big thank you to my collaborators and mentors Wei-Ning Chen, @berivanisik.bsky.social, Sanmi Koyejo, Albert No
🧵 16/16
We tried generating images using different values of the subsample size (m) and the DP parameter ε. Our results were particularly good for Textual Inversion (TI)!
🧵 15/16
We tested the effectiveness of our approach on two different target datasets: a collection of artworks from an artist (with consent; see her art on Instagram: @eveismyname) and the Paris 2024 Olympic pictograms (approved for non-commercial editorial use, ©️ IOC - 2023)
🧵 14/16
By aggregating over only a smaller random sample of the target embeddings, we strengthen our DP guarantees (privacy amplification by subsampling). This lets us achieve the same privacy guarantees with much less noise, and hence much better image quality! ✨
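Here is a minimal numpy sketch of that subsampled aggregation; the names (subsampled_dp_aggregate, m, sigma, clip_norm) are our illustrative choices, not code from the paper:

```python
# Minimal sketch of privacy amplification by subsampling: average m randomly
# chosen clipped embeddings, then add Gaussian noise scaled to the mean's
# sensitivity. Illustrative only, not the paper's implementation.
import numpy as np

def subsampled_dp_aggregate(embeddings, m, sigma, clip_norm=1.0, seed=None):
    rng = np.random.default_rng(seed)
    n, d = embeddings.shape
    idx = rng.choice(n, size=m, replace=False)  # each image included w.p. m/n
    sub = embeddings[idx]
    # Clip each embedding to L2 norm <= clip_norm to bound its contribution.
    norms = np.linalg.norm(sub, axis=1, keepdims=True)
    sub = sub / np.maximum(1.0, norms / clip_norm)
    # Changing one image moves the mean by at most ~2*clip_norm/m, so noise of
    # scale sigma*clip_norm/m suffices; subsampling then amplifies the
    # resulting (epsilon, delta) guarantee.
    return sub.mean(axis=0) + rng.normal(0.0, sigma * clip_norm / m, size=d)
```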
🧵 13/16
For a bigger privacy-utility boost, we can also introduce subsampling [1].
[1] arxiv.org/abs/2210.00597
🧵 12/16
4. Apply the noisy aggregated embedding to Style Guidance or Textual Inversion
5. Serve and enjoy! 🍴
For details, see our paper:
📄: arxiv.org/abs/2411.14639
🧵 11/16
Our recipe can be summarized as follows (a code sketch follows the list): 🍳
1. Obtain an embedding vector for each image in the target dataset
2. Aggregate the embeddings to limit the sensitivity to any individual image 🥣
3. Add DP noise using the Gaussian mechanism 🧂
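A minimal sketch of steps 1-3, assuming a Hugging Face CLIP image encoder; the model choice, the function name dp_style_embedding, and the noise calibration are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

def dp_style_embedding(image_paths, sigma, clip_norm=1.0):
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    # Step 1: one embedding vector per image in the target dataset.
    images = [Image.open(p).convert("RGB") for p in image_paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        embs = model.get_image_features(**inputs).numpy()
    # Clip each embedding so no single image contributes more than clip_norm.
    norms = np.linalg.norm(embs, axis=1, keepdims=True)
    embs = embs / np.maximum(1.0, norms / clip_norm)
    # Step 2: aggregate by averaging; the mean's L2 sensitivity to any one
    # image is now bounded by roughly clip_norm / n.
    n = len(image_paths)
    mean = embs.mean(axis=0)
    # Step 3: Gaussian mechanism, with sigma the noise multiplier that sets
    # the (epsilon, delta) guarantee.
    return mean + np.random.normal(0.0, sigma * clip_norm / n, size=mean.shape)
```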
🧵 10/16
2. Textual Inversion [1] (use the target dataset to train a new token embedding vector that is later used in the text prompt during image generation)
[1] arxiv.org/abs/2208.01618
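For intuition, here is one plausible way to plug a privatized token embedding into Stable Diffusion with diffusers; this is a hedged sketch, and the token name <private-style> and the stand-in noisy_embedding are made up for illustration:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Register a placeholder token and overwrite its embedding row with the
# DP-noised vector (768-dim for SD 1.5's text encoder).
pipe.tokenizer.add_tokens("<private-style>")
token_id = pipe.tokenizer.convert_tokens_to_ids("<private-style>")
pipe.text_encoder.resize_token_embeddings(len(pipe.tokenizer))
noisy_embedding = torch.randn(768)  # stand-in for the DP-aggregated embedding
with torch.no_grad():
    pipe.text_encoder.get_input_embeddings().weight[token_id] = noisy_embedding

image = pipe("a painting of a city in <private-style> style").images[0]
```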
🧵 9/16
1. Universal Guidance's CLIP style guidance [1] (guide the image toward a target CLIP embedding during image generation)
[1] arxiv.org/abs/2302.07121
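The guidance signal here is essentially the gradient of a CLIP similarity; a minimal PyTorch sketch of that idea (our illustrative code, not Universal Guidance's implementation):

```python
import torch
import torch.nn.functional as F
from transformers import CLIPModel

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")

def style_guidance_grad(x0_pred, target_emb):
    """Gradient pulling the current denoised estimate x0_pred (a batch of
    CLIP-preprocessed 224x224 images) toward the target style embedding;
    subtract a scaled version of it from each sampling step."""
    x = x0_pred.detach().requires_grad_(True)
    emb = clip.get_image_features(pixel_values=x)
    loss = -F.cosine_similarity(emb, target_emb.unsqueeze(0), dim=-1).mean()
    loss.backward()
    return x.grad
```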
🧵 8/16
But here, we propose a new approach using embedding vectors.
Our work focuses on applying DP to known diffusion model adaptation approaches that encode the target dataset into an embedding vector, including:
🧵 7/16
We therefore turn to other DP approaches that don't require full training using DP-SGD. Some work has been done on this, such as DP-LoRA [1] (utilizing Low-Rank Adaptation) and DP-RDM [2] (utilizing Retrieval-Augmented Generation).
[1] arxiv.org/abs/2110.06500
[2] arxiv.org/abs/2403.14421
🧵 6/16
But while DP-SGD is powerful, it struggles with:
1. High computational costs
2. Incompatibility with batch normalization
3. Severe degradation in image quality
🧵 5/16
The first solution that comes to mind is differential privacy (DP), which adds calibrated noise so that no single training example can be reliably inferred from the output. DP-SGD [1] is particularly popular for neural networks, and work has been done to adapt DP-SGD to diffusion models.
[1] arxiv.org/abs/1607.00133
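For reference, the core DP-SGD update of [1] (clip each per-example gradient, average, add Gaussian noise) looks roughly like this teaching sketch; real training would use a library such as Opacus:

```python
import torch

def dp_sgd_step(model, loss_fn, xs, ys, lr=0.1, C=1.0, sigma=1.0):
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):  # per-example gradients (the expensive part)
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in params))
        clip = min(1.0, C / (norm.item() + 1e-12))  # clip to L2 norm C
        for s, p in zip(summed, params):
            s.add_(p.grad, alpha=clip)
    with torch.no_grad():
        for s, p in zip(summed, params):
            noise = torch.randn_like(s) * sigma * C  # Gaussian mechanism
            p.sub_(lr * (s + noise) / len(xs))
    return model
```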
🧵 4/16
This means the model might directly recreate training images instead of generalizing patterns. This poses copyright concerns for artists and privacy issues for sensitive datasets. ©️
🧵 3/16
Diffusion models like Stable Diffusion have revolutionized image generation and can be personalized on smaller datasets to capture specific objects or styles. But personalizing on small datasets risks memorization.
🧵 2/16
How might we get a diffusion model to "learn" an art style without copying specific artworks? 🎨
🧵 Let's find out! (1/16)
📄: arxiv.org/abs/2411.14639