Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model

Zhicai Wang · Longhui Wei · Tan Wang · Heyu Chen · Yanbin Hao · Xiang Wang · Xiangnan He · Qi Tian

Arch 4A-E Poster #254
Thu 20 Jun 5 p.m. PDT — 6:30 p.m. PDT


Text-to-image (T2I) generative models have recently enabled a wide range of applications owing to their ability to generate high-fidelity, photo-realistic images. However, how to leverage T2I models for fundamental image classification remains an open question. A common approach to enhancing image classification is to expand the training set with synthetic data generated by T2I models. In this study, we examine the effectiveness and drawbacks of the two main expansion strategies, namely distillation-based methods and data augmentation methods. Our findings indicate that both struggle to generate images that are simultaneously faithful and diverse for domain-specific concepts. To address this issue, we propose a novel inter-class data augmentation method, Diff-Mix. Diff-Mix expands the dataset by performing image translation in an inter-class manner, significantly improving the diversity of the synthetic data. We observe an improved trade-off between faithfulness and diversity with Diff-Mix, yielding significant performance gains across various image classification settings, including few-shot, conventional, and long-tail classification, particularly on domain-specific datasets.
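The core idea above (translating an image toward a *different* class and mixing the labels accordingly) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `translate` callable stands in for a diffusion img2img model (e.g., a Stable Diffusion image-to-image pipeline), and the proportional soft-label scheme keyed to translation strength is an assumption of this sketch.

```python
import numpy as np

def soft_label(src_class, tgt_class, strength, num_classes):
    """Interpolate a one-hot label between source and target class in
    proportion to the translation strength (an assumption of this sketch;
    the paper's exact labeling scheme may differ)."""
    label = np.zeros(num_classes, dtype=np.float32)
    label[src_class] += 1.0 - strength
    label[tgt_class] += strength
    return label

def diff_mix_augment(image, src_class, class_names, translate,
                     strength=0.7, rng=None):
    """Inter-class augmentation: edit `image` toward a randomly chosen
    *different* class via a user-supplied img2img `translate` callable,
    and return the edited image with an interpolated soft label."""
    rng = rng or np.random.default_rng()
    # Pick a target class distinct from the source (the inter-class step).
    candidates = [c for c in range(len(class_names)) if c != src_class]
    tgt_class = int(rng.choice(candidates))
    # `translate` is hypothetical here; in practice this would call a
    # diffusion image-to-image pipeline conditioned on the target prompt.
    edited = translate(image,
                       prompt=f"a photo of a {class_names[tgt_class]}",
                       strength=strength)
    return edited, soft_label(src_class, tgt_class, strength,
                              len(class_names))
```

A stronger translation (higher `strength`) shifts the soft label further toward the target class, which is what lets the synthetic samples broaden diversity without fully discarding source-class supervision.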
