Thermal Diffusion Matters: Infrared Spatial-Temporal Video Super-Resolution through Heat Conduction Priors
Mingxuan Zhou ⋅ Shuang Li ⋅ Yutang Zhang ⋅ Jing Geng ⋅ Yirui Shen ⋅ Jingxuan Kang ⋅ Fuzhen Zhuang ⋅ Shuigen Wang
Abstract
Infrared video acquisition inherently suffers from low spatial resolution and limited frame rates due to the physical constraints of thermal imaging sensors. These limitations make infrared video enhancement uniquely challenging, as it requires restoring spatial details and temporal continuity from highly undersampled thermal signals. To address this challenge, we propose `THERIS`, a unified **THER**mal-physics inspired framework for **I**nfrared spatial-temporal video **S**uper-resolution. Grounded in the physical principles of thermal diffusion, `THERIS` leverages heat conduction dynamics that govern the spatiotemporal evolution of infrared pixel intensities. Specifically, the proposed Thermal Diffusion Interpolation Module (TDIM) treats temporal feature sequences as one-dimensional heat fields and performs frequency-domain diffusion to synthesize temporally coherent intermediate frames. Building on this foundation, the Thermo-Aware State Space Module (TSSM) refines spatiotemporal representations through learnable spectral filtering and selective state-space modeling, while maintaining consistency guided by the thermodynamic prior inherited from TDIM. Additionally, a Temperature Field Modeling Loss is introduced to enforce adherence to the heat conduction equation, promoting temporal coherence and spatial stability in the generated results. Extensive experiments demonstrate that `THERIS` achieves state-of-the-art performance while producing visually coherent results. To facilitate further research in the infrared video processing domain, we also introduce **IRVAL**, a high-resolution dataset comprising 108,512 video frames at 512$\times$512 resolution.
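As an illustration of the frequency-domain diffusion idea the abstract attributes to TDIM, the following is a minimal sketch and not the authors' implementation. It assumes a feature sequence of shape `(T, C)`, periodic boundaries along the temporal axis, and the closed-form Fourier solution of the 1D heat equation, in which each frequency component $k$ is damped by $e^{-\alpha k^2 \tau}$. The function name `diffuse_along_time` and the parameters `alpha` (diffusivity) and `tau` (diffusion time) are hypothetical and chosen only for this example.

```python
import numpy as np


def diffuse_along_time(features, alpha=0.1, tau=1.0):
    """Apply 1D heat diffusion along axis 0 of a (T, C) array.

    Assumption: periodic boundary conditions in time, so the heat
    equation u_t = alpha * u_xx is solved exactly per Fourier mode.
    """
    n = features.shape[0]
    # Real FFT along the temporal axis: shape (n // 2 + 1, C).
    spectrum = np.fft.rfft(features, axis=0)
    # Angular frequency of each bin, in radians per frame.
    k = 2.0 * np.pi * np.fft.rfftfreq(n)
    # Heat-kernel damping: high temporal frequencies decay fastest.
    decay = np.exp(-alpha * k**2 * tau)
    # Back to the time domain, keeping the original length.
    return np.fft.irfft(spectrum * decay[:, None], n=n, axis=0)


# Usage: smooth a random 8-step sequence with 3 feature channels.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3))
y = diffuse_along_time(x, alpha=0.5, tau=1.0)
```

Because the zero-frequency component is left untouched, the per-channel temporal mean is preserved while the fluctuations around it shrink, which matches the smoothing behavior expected of heat conduction. How the paper turns such a diffused sequence into intermediate frames is not specified in the abstract, so that step is omitted here.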