Gradient Knows Best: Mixed-Precision Quantization via Gradient-Guided Bit Allocation for Super-Resolution
Abstract
Although deep learning-based image super-resolution (SR) models have achieved remarkable progress in reconstruction quality, their high computational and memory demands make them unsuitable for lightweight platforms. To address this issue, various quantization techniques have been introduced. Among them, mixed-precision quantization (MPQ) performs layer-wise bit-width allocation to balance computational efficiency with reconstruction quality. However, existing MPQ methods based on post-training quantization (PTQ) for SR models face two critical limitations. First, estimating quantization sensitivity from static statistics fails to capture the quantization error actually induced by each layer, resulting in suboptimal bit allocation. Second, removing batch normalization (BN) to preserve high-frequency details leads to scale inconsistencies across activations, making fixed quantization ranges insufficient to represent their distributions accurately. We therefore propose a novel PTQ-based MPQ framework tailored to SR models. Our method estimates the quantization sensitivity of weights and activations from the gradients of the objective function with respect to the bit-widths, enabling adaptive layer-wise bit allocation and fast convergence. In addition, we introduce a dynamic activation range normalization that alleviates the distributional imbalance caused by the absence of BN, ensuring stable quantization under fixed range constraints. Our method outperforms existing PTQ-based methods by 1.26 dB in peak signal-to-noise ratio (PSNR) on the Urban100 dataset and reduces quantization time by a factor of 1.9 for 3-bit quantization of EDSR ×4.
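To give a concrete intuition for gradient-guided bit allocation, the following minimal sketch treats each layer's sensitivity as the finite-difference change in quantization error with respect to bit-width and greedily assigns extra bits to the most sensitive layers. All function names, the finite-difference surrogate for the bit-width gradient, and the greedy budget scheme are our illustrative assumptions, not the paper's actual algorithm.

```python
# Illustrative sketch only: sensitivity = finite-difference "gradient" of
# quantization error w.r.t. bit-width; extra bits go greedily to the layer
# whose error would drop most. (Assumed scheme, not the paper's method.)

def quantize(weights, bits):
    """Uniform symmetric quantization of a list of weights."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]

def quant_error(weights, bits):
    """Mean squared quantization error at a given bit-width."""
    q = quantize(weights, bits)
    return sum((w - x) ** 2 for w, x in zip(weights, q)) / len(weights)

def bit_sensitivity(weights, bits):
    """Finite-difference surrogate for dE/d(bits): error drop per extra bit."""
    return quant_error(weights, bits) - quant_error(weights, bits + 1)

def allocate_bits(layers, base_bits=3, extra_budget=2):
    """Greedily spend a budget of extra bits on the most sensitive layers."""
    bits = [base_bits] * len(layers)
    for _ in range(extra_budget):
        grads = [bit_sensitivity(w, b) for w, b in zip(layers, bits)]
        most_sensitive = max(range(len(layers)), key=lambda i: grads[i])
        bits[most_sensitive] += 1
    return bits
```

In this toy setting, a layer with a wide weight range (hence a large quantization step) exhibits a larger error drop per extra bit than a layer with a narrow range, so it attracts the extra budget; the paper's method instead derives sensitivities from gradients of the task objective, which also accounts for activation sensitivity.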