

Poster

Enhancing Diversity for Data-free Quantization

Kai Zhao · zhihao zhuang · Miao Zhang · Chenjuan Guo · Yang Shu · Bin Yang


Abstract:

Model quantization is an effective way to compress deep neural networks and accelerate inference on edge devices. Existing quantization methods usually require the original data for calibration during compression, which may be inaccessible due to privacy concerns. A common workaround is to generate calibration data that mimics the original data. However, the generators in these methods suffer from mode collapse and therefore cannot synthesize diverse data. To address this problem, we leverage information from the full-precision model and enhance both inter-class and intra-class diversity of the generated calibration data by devising a multi-layer feature mixer and normalizing-flow-based attention. In addition, we propose novel regularization losses that encourage the generator to produce diverse data with richer patterns of activated feature values, and that enable the quantized model to learn clip ranges better adapted to our diverse calibration data. Extensive experiments show that our method achieves state-of-the-art quantization results for both Transformer and CNN architectures. We also visualize the generated data to verify that our strategies effectively mitigate mode collapse. Our code is available at https://anonymous.4open.science/r/DFQ-84E6 and will be made publicly available.
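To make the general idea of synthetic calibration data concrete, the sketch below shows one common data-free approach: optimizing synthetic inputs so that a pretrained full-precision model's BatchNorm statistics are matched, then using those inputs to calibrate the quantized model's clip ranges. This is not the paper's method; the diversity-enhancing feature mixer, normalizing-flow-based attention, and the proposed regularization losses are not implemented here, and the model choice (resnet18), loss weight, learning rate, and step count are illustrative assumptions.

# Generic data-free calibration-data synthesis (NOT the paper's method):
# optimize synthetic inputs so the full-precision model's BatchNorm input
# statistics match its stored running statistics, then use the result to
# calibrate activation clip ranges of a quantized model.
import torch
import torch.nn as nn
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
fp_model = models.resnet18(weights="IMAGENET1K_V1").to(device).eval()

# Capture the input features of every BatchNorm layer with forward hooks.
bn_layers, feats = [], {}
def make_hook(module):
    def hook(_m, inputs, _out):
        feats[module] = inputs[0]
    return hook

for m in fp_model.modules():
    if isinstance(m, nn.BatchNorm2d):
        bn_layers.append(m)
        m.register_forward_hook(make_hook(m))

# The synthetic "calibration images" are free parameters optimized directly;
# random labels give a class-conditional term that spreads the batch over classes.
x = torch.randn(32, 3, 224, 224, device=device, requires_grad=True)
labels = torch.randint(0, 1000, (32,), device=device)
opt = torch.optim.Adam([x], lr=0.05)

for step in range(500):
    opt.zero_grad()
    logits = fp_model(x)
    ce = nn.functional.cross_entropy(logits, labels)
    # Match per-channel mean/variance of the synthetic features to the stored
    # running statistics, keeping the data close to the training distribution.
    bns = x.new_zeros(())
    for m in bn_layers:
        f = feats[m]
        mu = f.mean(dim=(0, 2, 3))
        var = f.var(dim=(0, 2, 3), unbiased=False)
        bns = bns + (mu - m.running_mean).pow(2).mean() \
                  + (var - m.running_var).pow(2).mean()
    loss = ce + 0.1 * bns
    loss.backward()
    opt.step()

calibration_batch = x.detach()  # feed through the quantized model to set clip ranges

The paper's contribution targets the weakness of exactly this kind of pipeline: without explicit diversity mechanisms, the synthesized batch tends to collapse onto a few modes, which degrades the calibrated clip ranges.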
