RDF-MIG: A Robust Diffusion Framework for Masked Image Generation to Augment Semantic Segmentation and Change Detection
Abstract
Change detection and semantic segmentation are key techniques for satellite image analysis in remote sensing. However, acquiring high-quality labeled data is costly and time-consuming. Although recent studies have explored generative models to ease data scarcity, a unified framework supporting both tasks is still lacking; moreover, most existing methods overlook noise accumulation and cannot generate multispectral images. To address these issues, we propose the Robust Diffusion Framework for Masked Image Generation (RDF-MIG). RDF-MIG generates bi-temporal change-labeled and single-temporal segmentation-labeled images to enhance downstream change detection and semantic segmentation. Furthermore, to mitigate noise accumulation and improve the quality of the generated image–mask pairs, we reformulate the diffusion training objective as a Maximum Correntropy Robust Diffusion (MCRD) loss, and design an MSE-consistency calibration that analytically aligns its small-error gradients with those of the MSE objective while preserving robustness to outliers. Experiments show that RDF-MIG generates multispectral image–mask pairs that improve downstream performance, and that the MCRD loss further enhances the quality of the synthesized data.
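The abstract does not give the functional form of the MCRD loss. As a minimal illustrative sketch only, the following assumes the standard correntropy-induced (Welsch-type) loss, whose gradient with respect to the error e is e·exp(−e²/2σ²): for small errors this matches the gradient of the MSE objective ½e² (the "MSE-consistency" property), while for large outlier errors the loss saturates at σ². The function name, kernel bandwidth σ, and scaling are assumptions, not the paper's definition.

```python
import numpy as np

def mcrd_loss_sketch(pred, target, sigma=1.0):
    """Hypothetical correntropy-induced robust loss (Welsch form).

    L(e) = sigma^2 * (1 - exp(-e^2 / (2 sigma^2)))

    dL/de = e * exp(-e^2 / (2 sigma^2)):
      - as e -> 0, the gradient approaches e, i.e. the gradient of
        the MSE objective 0.5 * e^2 (MSE-consistent small-error regime);
      - as |e| grows, the gradient decays toward 0, so outlier
        (heavily noised) pixels contribute a bounded penalty <= sigma^2.
    """
    e = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(sigma**2 * (1.0 - np.exp(-e**2 / (2.0 * sigma**2)))))

# Small errors behave like 0.5 * MSE; large errors are capped near sigma^2.
small = mcrd_loss_sketch(np.array([0.01]), np.array([0.0]))
large = mcrd_loss_sketch(np.array([100.0]), np.array([0.0]))
```

The bounded gradient is what would suppress noise accumulation during iterative diffusion sampling: heavily corrupted pixels stop dominating the training signal instead of being fit as if they were clean targets.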