DVAR: Dynamic Visual Autoregressive Modeling for Image Super-Resolution
Abstract
Visual autoregressive (VAR) models based on next-scale prediction have demonstrated significant potential for image super-resolution. However, their practical application is constrained by a rigid, size-specific design: because they memorize fixed, absolute scaling schedules, a distinct model is required for each target resolution. We introduce DVAR, a Dynamic Visual AutoRegressive framework that overcomes this bottleneck. Instead of memorizing rigid schedules, DVAR learns a canonical scaling dynamic that decouples the logic of relative scaling from the absolute target size, preserving a single set of proportions between generative steps that can be applied uniformly to any size. Furthermore, we introduce a dynamic sampling scheduler that mitigates the teacher-forcing problem with negligible computational overhead. By leveraging the geometric proximity of visual tokens in the codebook, the scheduler efficiently simulates the model's predictive error distribution and thereby bridges the training-inference gap. To our knowledge, DVAR is the first framework to grant VAR models size flexibility, breaking their one-to-one dependency on a fixed resolution. Extensive evaluations demonstrate that DVAR achieves superior visual quality over existing real-world image super-resolution (Real-ISR) methods, indicating that a flexible, purely autoregressive approach is a viable path to state-of-the-art image super-resolution.
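
The two ideas summarized above can be illustrated with a minimal sketch. It is not the DVAR implementation: the ratio values, the ceiling rounding, the function names (`scale_schedule`, `nearest_neighbors`, `perturb_tokens`), and the fixed replacement probability `p` are all assumptions made for illustration. The first function maps one set of relative proportions to an absolute schedule for any target size; the other two replace ground-truth tokens with nearby codebook entries to imitate prediction errors during training.

```python
import math
import random

# Hypothetical canonical ratios: each step's side length as a fraction of the
# final token-map size. Illustrative values only, not taken from DVAR.
CANONICAL_RATIOS = (1 / 16, 1 / 8, 3 / 16, 1 / 4, 3 / 8, 1 / 2, 3 / 4, 1.0)


def scale_schedule(target_h, target_w, ratios=CANONICAL_RATIOS):
    """Turn relative ratios into absolute (h, w) token-map sizes for one target."""
    return [
        (max(1, math.ceil(target_h * r)), max(1, math.ceil(target_w * r)))
        for r in ratios
    ]


def nearest_neighbors(codebook, k):
    """For each code vector, list the indices of its k closest other codes."""
    table = []
    for i, ci in enumerate(codebook):
        # Euclidean distance to every other code, sorted ascending.
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(codebook) if j != i
        )
        table.append([j for _, j in dists[:k]])
    return table


def perturb_tokens(tokens, neighbors, p, rng):
    """With probability p, swap each token for one of its codebook neighbors.

    Assumption: a fixed p stands in for whatever schedule the dynamic
    sampling scheduler actually uses.
    """
    return [rng.choice(neighbors[t]) if rng.random() < p else t for t in tokens]


if __name__ == "__main__":
    # The same ratios serve two different target sizes.
    print(scale_schedule(16, 16))
    print(scale_schedule(32, 48))

    # Toy 2-D codebook with two tight clusters.
    codebook = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2)]
    nn = nearest_neighbors(codebook, k=1)
    print(perturb_tokens([0, 2, 3], nn, p=0.5, rng=random.Random(0)))
```

Because the neighbor table depends only on the codebook, it can be computed once before training, which is one plausible reading of why such a scheme would add little overhead.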