Forensic-Friendly Image Manipulation via Controllable Latent Diffusion
Abstract
As diffusion models demonstrate superior capabilities in image editing, more users now rely on online servers for content manipulation via textual prompts rather than traditional offline tools. Although servers attempt to curb the proliferation of maliciously edited content through active defenses such as watermarking, this approach does little to support passive detection by third-party forensics. To address this limitation, we propose a plug-and-play controllable denoising scheme, termed Forensic-Friendly Image Manipulation (FFIM), which simultaneously satisfies user editing requirements and facilitates forensic analysis. Specifically, FFIM comprises three phases: Controllable Projection, Implicit Detection, and Explicit Guidance. Phase I enforces orthogonality between the variance of the random noise and the image features to ensure a clear demarcation between the edited and unedited regions. Phase II implicitly evaluates whether this demarcation meets detection requirements; if not, Phase III explicitly introduces a surrogate detection model and adversarially adjusts the random noise to maximize the feature discrepancy between these regions. Experiments across four datasets demonstrate the superiority of FFIM over baseline methods, achieving up to +6.6\% F1 in pixel-level localization and +27.3\% AUC in image-level detection. Importantly, these forensic gains are attained without compromising visual quality, as evidenced by comparable manipulation quality in both subjective user studies and objective quality assessments. We envision that the proposed method will be widely adopted by generative AI service providers, enabling more comprehensive verification of information authenticity from a passive-defense standpoint.
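The three-phase pipeline can be illustrated with a minimal NumPy sketch. Everything here is an illustrative assumption rather than the paper's actual implementation: the orthogonal projection stands in for Controllable Projection, a scalar statistic over the edit mask stands in for the implicit detector, and a sign-based update stands in for the adversarial adjustment through a surrogate detection model.

```python
import numpy as np

def project_orthogonal(noise, feat):
    # Phase I (sketch): remove the component of the random noise that is
    # aligned with the image-feature direction, so the noise is orthogonal
    # to the features. `feat` is a hypothetical flattened feature vector.
    unit = feat / np.linalg.norm(feat)
    return noise - np.dot(noise, unit) * unit

def discrepancy(noise, mask):
    # Phase II (sketch): a toy proxy for the implicit detector — the gap
    # between noise statistics inside (edited) and outside (unedited)
    # the boolean edit mask.
    return abs(noise[mask].mean() - noise[~mask].mean())

def adjust_noise(noise, mask, step=0.05, iters=50, tau=0.5):
    # Phase III (sketch): if the implicit check fails (gap below tau),
    # iteratively nudge the noise in the edited region to widen the
    # edited/unedited gap, standing in for an adversarial update driven
    # by a surrogate detection model.
    for _ in range(iters):
        if discrepancy(noise, mask) >= tau:
            break
        sign = np.sign(noise[mask].mean() - noise[~mask].mean()) or 1.0
        noise[mask] += step * sign
    return noise
```

In this toy setting, `tau` plays the role of the detection requirement: the explicit guidance loop only fires when the implicit check fails, mirroring the conditional Phase II/Phase III structure described above.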