AdvFM: Lookahead Flow-Matching Velocity-Field Attacks for Imperceptible and Transferable Adversarial Examples
Runze Liu ⋅ Zeyue Wang ⋅ Fanghui Sun ⋅ Rui Liu ⋅ Yihan Yan ⋅ Shen Wang ⋅ Zhaoyang Zhang
Abstract
Unrestricted adversarial attacks based on generative models typically operate either directly in image space or through diffusion-style denoising and re-noising, which limits transferability and robustness against defenses. We revisit this problem through the lens of flow matching and continuous-time velocity fields, and propose AdvFM, a velocity-field attack that injects adversarial signals into the flow-matching dynamics instead of the pixel space. Given a noisy state $x_t$, AdvFM perturbs the reconstruction at $t{=}1$ and converts this perturbation into a change of the velocity field, yielding a state update that amplifies the inner PGD step in the noisy space. We further introduce a lookahead variant that optimizes a two-point objective over the current and rolled-out reconstructions, reducing temporal mismatch along the ODE trajectory. From a theoretical perspective, we show that compared to diffusion-based attacks, AdvFM enjoys: (i) larger single-step increases in the black-box loss via step amplification, (ii) reduced gradient variance and stronger surrogate-target alignment due to Gaussian smoothing, enhancing its transferability, and (iii) perturbations that concentrate in robust-tangent directions, thereby aligning with robust gradients of adversarially trained models and surviving purification more effectively; the lookahead variant further lowers gradient noise for a two-point robust objective. Extensive experiments demonstrate that AdvFM achieves strong black-box transferability and robustness against a suite of adversarial-training and purification defenses.
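To make the core mechanism concrete, the following is a minimal, illustrative sketch of one AdvFM-style inner step, assuming the common flow-matching convention $x_t = (1-t)x_0 + t x_1$ with velocity $v \approx x_1 - x_0$, so the one-step reconstruction is $\hat{x}_1 = x_t + (1-t)v$. All names (`velocity`, `advfm_inner_step`, `alpha`) and the toy velocity field are hypothetical stand-ins, not the paper's implementation; the `1/(1-t)` factor illustrates the step amplification the abstract refers to.

```python
import numpy as np

def velocity(x_t, t):
    # Stand-in for a learned velocity network; a toy linear field for illustration.
    return -x_t

def advfm_inner_step(x_t, t, loss_grad_x1, alpha=0.01):
    """One PGD-style step on the reconstruction, mapped back to the noisy state.

    loss_grad_x1: gradient of the adversarial loss w.r.t. the reconstruction.
    A perturbation of size alpha on x1_hat corresponds to a velocity change of
    alpha / (1 - t), which grows as t -> 1 (the "step amplification" effect).
    """
    v = velocity(x_t, t)
    x1_hat = x_t + (1.0 - t) * v                    # one-step reconstruction at t = 1
    x1_adv = x1_hat + alpha * np.sign(loss_grad_x1) # PGD sign step on the reconstruction
    dv = (x1_adv - x1_hat) / (1.0 - t)              # perturbation as a velocity change
    return x_t + (1.0 - t) * (v + dv)               # updated noisy state

# Toy usage: a constant state and a pretend surrogate-loss gradient.
x_t = np.ones(4)
g = np.ones(4)
x_next = advfm_inner_step(x_t, t=0.5, loss_grad_x1=g)
```

By construction, applying the velocity change and re-running the one-step map lands exactly on the perturbed reconstruction, which is why optimizing in velocity space rather than pixel space still controls the output at $t{=}1$.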