CVPR Poster In-distribution Public Data Synthesis with Diffusion Models for Differentially Private Image Classification

Jinseong Park · Yujin Choi · Jaewook Lee

Arch 4A-E Poster #253

Abstract: To alleviate the utility degradation of deep learning image classification with differential privacy (DP), employing extra public data or pre-trained models has been widely explored. Recently, the use of in-distribution public data has been investigated, where a tiny subset of data owners share their data publicly. In this paper, we investigate a framework that leverages recent diffusion models to amplify the information of public data. Subsequently, we identify data diversity and the generalization gap between public and private data as critical factors in addressing the limited size of public data. While assuming 4% of training data as public, our method achieves 85.48% accuracy on CIFAR-10 without using pre-trained models, with a privacy budget of $(\varepsilon, \delta) = (2, 10^{-5})$.
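
To make the "4% of training data as public" setting concrete, the following is a minimal sketch of how such a public/private split could be drawn. It is not the authors' code: the helper name `split_public_private`, the uniform random sampling, and the seed are all assumptions made for illustration. The only external fact used is that the CIFAR-10 training set has 50,000 images.

```python
import random


def split_public_private(num_samples, public_fraction, seed=0):
    """Split sample indices into disjoint public and private sets.

    Hypothetical helper: uniform random sampling is an assumption,
    not something the abstract specifies.
    """
    indices = list(range(num_samples))
    # Shuffle with a local, seeded generator so the split is reproducible.
    random.Random(seed).shuffle(indices)
    num_public = round(num_samples * public_fraction)
    public_idx = sorted(indices[:num_public])
    private_idx = sorted(indices[num_public:])
    return public_idx, private_idx


# CIFAR-10 has 50,000 training images; 4% of them are treated as public.
public_idx, private_idx = split_public_private(50_000, 0.04)
print(len(public_idx), len(private_idx))  # 2000 48000
```

Under these assumptions, only 2,000 images are public, which may help explain why the abstract focuses on using diffusion models to amplify the information in such a small public set.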
