NIPQ: Noise Proxy-Based Integrated Pseudo-Quantization

Juncheol Shin · Junhyuk So · Sein Park · Seungyeop Kang · Sungjoo Yoo · Eunhyeok Park

West Building Exhibit Halls ABC 367


Straight-through estimator (STE), which enables the gradient flow over the non-differentiable function via approximation, has been favored in studies related to quantization-aware training (QAT). However, STE incurs unstable convergence during QAT, resulting in notable quality degradation in low-precision representation. Recently, pseudo-quantization training has been proposed as an alternative approach to updating the learnable parameters using the pseudo-quantization noise instead of STE. In this study, we propose a novel noise proxy-based integrated pseudo-quantization (NIPQ) that enables unified support of pseudo-quantization for both activation and weight with minimal error by integrating the idea of truncation on the pseudo-quantization framework. NIPQ updates all of the quantization parameters (e.g., bit-width and truncation boundary) as well as the network parameters via gradient descent without STE instability, resulting in greatly-simplified but reliable precision allocation without human intervention. Our extensive experiments show that NIPQ outperforms existing quantization algorithms in various vision and language applications by a large margin.

Chat is not available.