Spatial-Spectral Residuals Informed Diffusion Neural Operator for Pan-sharpening
Abstract
Pan-sharpening, a fundamental preprocessing task in remote sensing, aims to generate spatially and spectrally enriched multispectral imagery by fusing the complementary information of texture-rich panchromatic (PAN) images and their paired low-resolution multispectral (LRMS) counterparts. Although recent generative diffusion models achieve impressive fusion quality, their performance gains come at substantial computational cost, making them impractical for the resource-constrained scenarios common in remote sensing. This work introduces a function-space diffusion model built on a neural operator architecture that delivers compelling performance with markedly improved efficiency. Specifically, our framework replaces the standard attention-based denoising backbone with a Galerkin-type neural operator, reducing the attention complexity from quadratic to linear in the number of pixels while preserving strong representational capacity. Furthermore, by explicitly injecting pixel-wise spatial-spectral consistency residuals into each reverse diffusion step, our method establishes a fine-grained, closed-loop guidance mechanism that dynamically calibrates spatial detail and spectral fidelity throughout the generation process. Extensive experiments on multiple benchmark datasets demonstrate the superiority of our approach over state-of-the-art methods.
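The key efficiency claim rests on Galerkin-type attention, which drops the softmax and normalizes keys and values instead, so the context matrix K^T V (size d x d) is computed once and applied to the queries, making the cost linear rather than quadratic in the number of pixels. The following is a minimal sketch of this mechanism in PyTorch; the module name, head count, and layer placement are illustrative assumptions, not the paper's exact backbone.

```python
import torch
import torch.nn as nn


class GalerkinAttention(nn.Module):
    """Softmax-free, linear-complexity attention (Galerkin-type sketch).

    Instead of softmax(Q K^T) V, which is O(n^2 d) in sequence length n,
    it computes Q (K^T V) / n, which is O(n d^2). Layer norms on K and V
    stand in for the softmax normalization.
    NOTE: illustrative re-implementation, not the authors' code.
    """

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.heads, self.dk = heads, dim // heads
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)
        # per-head normalization of keys and values (Galerkin variant)
        self.ln_k = nn.LayerNorm(self.dk)
        self.ln_v = nn.LayerNorm(self.dk)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, dim), where n = number of spatial tokens/pixels
        b, n, _ = x.shape
        split = lambda t: t.view(b, n, self.heads, self.dk).transpose(1, 2)
        q = split(self.q(x))
        k = self.ln_k(split(self.k(x)))
        v = self.ln_v(split(self.v(x)))
        # K^T V is only (dk x dk): cost grows linearly with n
        ctx = torch.matmul(k.transpose(-2, -1), v) / n
        y = torch.matmul(q, ctx)
        return self.out(y.transpose(1, 2).reshape(b, n, -1))
```

Because the n x n attention map is never materialized, memory and compute stay tractable even for the large token counts produced by flattening high-resolution remote-sensing imagery.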