April 04, 2024
Text-to-image diffusion models have been shown to suffer from sample-level memorization, possibly reproducing near-perfect replica of images that they are trained on, which may be undesirable. To remedy this issue, we develop the first differentially private (DP) retrieval-augmented generation algorithm that is capable of generating high-quality image samples while providing provable privacy guarantees. Specifically, we assume access to a text-to-image diffusion model trained on a small amount of public data, and design a DP retrieval mechanism to augment the text prompt with samples retrieved from a private retrieval dataset. Our differentially private retrieval-augmented diffusion model (DP-RDM) requires no fine-tuning on the retrieval dataset to adapt to another domain, and can use state-of-the-art generative models to generate high-quality image samples while satisfying rigorous DP guarantees. For instance, when evaluated on MS-COCO, our DP-RDM can generate samples with a privacy budget of ϵ=10, while providing a 3.5 point improvement in FID compared to public-only retrieval for up to 10,000 queries.
Written by
Jonathan Lebensold
Pietro Astolfi
Kamalika Chaudhuri
Publisher
arXiv
Research Topics
Core Machine Learning
April 04, 2025
Olga Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar
April 04, 2025
January 02, 2025
Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Panagiotis Kyriakis, Nesreen K. Ahmed, Peiyu Zhang, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan
January 02, 2025
December 18, 2024
Haider Al-Tahan, Quentin Garrido, Randall Balestriero, Diane Bouchacourt, Caner Hazirbas, Mark Ibrahim
December 18, 2024
December 12, 2024
December 12, 2024
Foundational models
Our approach
Latest news
Foundational models