DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution
Abstract
Diffusion-based video super-resolution (VSR) achieves remarkable fidelity but suffers from prohibitive sampling cost. While distribution matching distillation (DMD) accelerates diffusion models to one-step generation, directly applying it to VSR leads to training instability and degraded, insufficient supervision. To address these issues, we propose \textbf{DUO-VSR}, a three-stage framework centered on a \textbf{DU}al-Stream Distillation strategy that integrates distribution matching and adversarial supervision for \textbf{O}ne-step VSR. We first adopt a Progressive Guided Distillation Initialization to stabilize subsequent training through trajectory-preserving distillation. We then introduce a Dual-Stream Distillation Strategy to jointly optimize DMD and Real–Fake Score Feature GAN (RFS-GAN) streams, with the latter providing complementary adversarial supervision using features from both real and fake score models. Finally, a Preference-Guided Refinement aligns the student with perceptual quality preferences. Comprehensive experiments demonstrate that DUO-VSR achieves superior visual quality and efficiency over previous one-step VSR methods.