Skip to yearly menu bar Skip to main content


Poster

I2VGuard: Safeguarding Images against Misuse in Diffusion-based Image-to-Video Models

Dongnan Gui · Xun Guo · Wengang Zhou · Yan Lu


Abstract:

Recent advances in image-to-video generation have enabled animation of still images and offered pixel-level controllability. While these models hold great potential to transform single images into vivid and dynamic videos, they also carry risks of misuse that could impact privacy, security, and copyright protection. This paper proposes a novel approach that applies imperceptible perturbations on images to degrade the quality of the generated videos, thereby protecting images from misuse in white-box image-to-video diffusion models. Specifically, we function our approach as an adversarial attack, incorporating spatial, temporal, and diffusion attack modules. The spatial attack shifts image features from their original distribution to a lower-quality target distribution, reducing visual fidelity. The temporal attack disrupts coherent motion by interfering with temporal attention maps that guide motion generation. To enhance the robustness of our approach across different models, we further propose a diffusion attack module leveraging contrastive loss. Our approach can be easily integrated with mainstream diffusion-based I2V models. Extensive experiments on SVD, CogVideoX, and ControlNeXt demonstrate that our method significantly impairs generation quality in terms of visual clarity and motion consistency, while introducing only minimal artifacts to the images. To the best of our knowledge, we are the first to explore adversarial attacks on image-to-video generation for security purposes.

Live content is unavailable. Log in and register to view live content