RevINN: An End-to-End Invertible Neural Network for Reversible Adversarial Examples Generation
Abstract
Recent studies have shown that Reversible Adversarial Examples (RAEs) can mislead unauthorized deep neural networks while remaining usable for authorized users, effectively preventing image data leakage. Existing RAE methods rely on reversibly embedding perturbation information into previously generated adversarial examples to enable restoration. However, this two-stage process often yields RAEs with weaker attack effectiveness and lower visual quality than the original adversarial examples. To address these challenges, we propose a novel end-to-end Invertible Neural Network for Reversible Adversarial Examples Generation (RevINN), which directly generates RAEs in a single stage by scrambling the intrinsic frequency information of images. Specifically, our RevINN consists of the Cross-Frequency Modulation Attack (CFMA) module and the High-Frequency Perturbation Enhancement (HFPE) module. CFMA selectively exchanges discriminative information between low- and high-frequency wavelet components to achieve adversariality. To fully alter high-frequency semantics, HFPE employs a tri-branch structure for fine-grained modulation among the high-frequency subbands, enhancing perturbation strength. Finally, the modified components are recomposed into RAEs via the inverse wavelet transform. Our RevINN is optimized with adversarial, perceptual, and invertible losses, and can exactly restore the original images owing to the reversibility of the wavelet operations and network modules. Extensive experiments demonstrate that our RevINN achieves state-of-the-art RAE generation quality. The code will be released to the public.
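The overall pipeline described above (wavelet decomposition, invertible modulation of the subbands, inverse wavelet recomposition, and exact restoration for authorized users) can be illustrated with a minimal sketch. Note that this is not the paper's implementation: the Haar transform is one concrete choice of wavelet, and the additive coupling function `net` is a hypothetical stand-in for the learned CFMA/HFPE modules, used here only because additive coupling is invertible by construction.

```python
import numpy as np

def haar_dwt(x):
    """One-level 2D Haar decomposition into LL, LH, HL, HH subbands."""
    p, q = x[0::2, 0::2], x[0::2, 1::2]
    r, s = x[1::2, 0::2], x[1::2, 1::2]
    ll = (p + q + r + s) / 2  # low-frequency approximation
    lh = (p - q + r - s) / 2  # horizontal detail
    hl = (p + q - r - s) / 2  # vertical detail
    hh = (p - q - r + s) / 2  # diagonal detail
    return ll, lh, hl, hh

def haar_idwt(ll, lh, hl, hh):
    """Exact inverse of haar_dwt: recompose subbands into the image."""
    H, W = ll.shape
    x = np.empty((2 * H, 2 * W))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return x

def perturb(ll, lh, hl, hh, net):
    """Invertible modulation: perturb the high-frequency subbands
    conditioned on the low-frequency subband (additive coupling)."""
    t = net(ll)
    return ll, lh + t, hl + t, hh + t

def restore(ll, lh2, hl2, hh2, net):
    """Authorized restoration: subtract the same coupling term."""
    t = net(ll)
    return ll, lh2 - t, hl2 - t, hh2 - t
```

A forward pass would decompose the image with `haar_dwt`, apply `perturb`, and recompose the adversarial image with `haar_idwt`; an authorized user repeats the decomposition and applies `restore` to recover the original exactly (up to floating-point error), since every step is invertible.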