Paper in Workshop: 21st Workshop on Perception Beyond the Visible Spectrum (PBVS'2025)
SwinPaste: A Swin Transformer-Based Framework for RGB-Guided Thermal Image Super-Resolution
Hang Zhong, Yu Wang, Shengjie Zhao
Thermal imaging plays a pivotal role across diverse applications, yet its efficacy is constrained by the inherently low resolution of widely accessible infrared (IR) cameras. Traditional super-resolution (SR) techniques often struggle on thermal images, primarily because these images contain few high-frequency details. To mitigate this, guided SR techniques use information from a high-resolution image, typically captured in the visible spectrum, to help reconstruct a high-resolution IR image from its low-resolution input. Inspired by SwinFuSR, we propose SwinPaste, an RGB-guided thermal image super-resolution model based on the Swin Transformer. First, we introduce a data mixing strategy during pre-training to enhance data diversity and improve model robustness. Second, we employ multi-scale supervision signals to recover high-frequency details more effectively and improve reconstruction quality. Our method achieves 30.94 dB PSNR and 0.9201 SSIM at ×8 scale, and 26.33 dB PSNR and 0.8593 SSIM at ×16 scale on the PBVS 2025 dataset, ranking second in Track 2 of the PBVS 2025 TISR Challenge.
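For readers unfamiliar with the reported metric, the following is a minimal sketch of how PSNR is conventionally computed between a reference image and a reconstruction. It is not the challenge's evaluation code: the function name `psnr`, the `max_value` parameter, and the assumption that pixel values are scaled to [0, 1] are illustrative choices, not details taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images.

    Images are given as nested lists of floats. `max_value` is the largest
    possible pixel value (assumed 1.0 here; use 255 for 8-bit images).
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("rows must have the same length")
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel is off by 0.1, so MSE is 0.01.
reference = [[0.0, 0.5], [1.0, 0.25]]
estimate = [[0.1, 0.6], [0.9, 0.35]]
print(round(psnr(reference, estimate), 2))
```

Because PSNR is logarithmic in the mean squared error, the gap between the ×8 and ×16 results corresponds to a multiplicative increase in error, not a linear one.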