

DiffForensics: Leveraging Diffusion Prior to Image Forgery Detection and Localization

Zeqin Yu · Jiangqun Ni · Yuzhen Lin · Haoyi Deng · Bin Li

Arch 4A-E Poster #303
Thu 20 Jun 10:30 a.m. PDT — noon PDT


Because manipulated images can lead to misinterpretation of visual content, the image forgery detection and localization (IFDL) problem has drawn serious public concern. In this work, we make a simple assumption: an effective forensic method should focus on the mesoscopic properties of images. Based on this assumption, we propose DiffForensics, a novel two-stage self-supervised framework that leverages the diffusion model for the IFDL task. DiffForensics begins with a self-supervised denoising diffusion paradigm built on an encoder-decoder structure: the pre-trained encoder (e.g., on ADE-20K) is frozen to inherit macroscopic features that capture general image characteristics, while the decoder is encouraged to learn microscopic feature representations of images, so that the whole model is pushed to focus on mesoscopic representations. The pre-trained model, serving as a prior, is then fine-tuned for the IFDL task with a customized Edge Cue Enhancement Module (ECEM), which progressively highlights the boundary features within the manipulated regions, thereby localizing tampered areas more precisely. Extensive experiments on several challenging public datasets demonstrate the effectiveness of the proposed method compared with other state-of-the-art methods. The proposed DiffForensics significantly improves the model's capabilities for both accurate tamper detection and precise tamper localization, while also improving its generalization and robustness.
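The abstract does not spell out the noise schedule or the training objective used in the first stage, so the following is only a minimal sketch of the standard denoising diffusion forward process that such a self-supervised stage typically builds on, not the authors' implementation. The linear schedule, the 1000 steps, the function names (`linear_beta_schedule`, `cumulative_alphas`, `add_noise`) and the toy four-value "image" are all assumptions made for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.

    Returns the noisy pixels and the noise eps; in the usual
    noise-prediction objective, eps is what the network regresses.
    """
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal_scale * x + noise_scale * e for x, e in zip(pixels, noise)]
    return noisy, noise


def mse(predicted, target):
    """Mean squared error between a predicted and a true noise vector."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [0.5, -0.25, 1.0, 0.0]  # toy flattened "image" with values in [-1, 1]
xt, eps = add_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

In a setup like the one the abstract describes, `xt` would be passed through the frozen encoder and the trainable decoder, and only the decoder's parameters would be updated to reduce `mse(predicted_noise, eps)`. That wiring is likewise an assumption here, since the abstract gives no architectural details beyond the encoder-decoder split.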
