Poster
Enhancing Adversarial Transferability with Checkpoints of a Single Model’s Training
Shixin Li · Chaoxiang He · Xiaojing Ma · Bin Benjamin Zhu · Shuo Wang · Hongsheng Hu · Dongmei Zhang · Linchen Yu
Adversarial attacks threaten the integrity of deep neural networks (DNNs), particularly in high-stakes applications. This paper presents a black-box adversarial attack strategy that leverages checkpoints from a single model’s training trajectory. Unlike traditional ensemble attacks, which require multiple surrogate models of different architectures, our approach uses a single model’s diverse training checkpoints to craft adversarial examples. By categorizing the knowledge learned during training into task-intrinsic and task-irrelevant knowledge, we identify checkpoints that predominantly capture task-intrinsic knowledge, which generalizes across models. We introduce an accuracy-gap-based selection strategy to enhance the transferability of adversarial examples to models with different architectures. Extensive experiments on benchmark datasets, including ImageNet and CIFAR-10, show that our method consistently outperforms traditional model-ensemble attacks in transferability. Moreover, our approach remains highly effective even with significantly reduced training data, offering a practical and resource-efficient route to highly transferable adversarial attacks.
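The accuracy-gap selection step described above can be sketched as follows. Note that the abstract does not specify the exact rule, so the gap definition (training accuracy minus held-out accuracy) and the "smallest gap first" ranking used here are illustrative assumptions, not the paper's specification:

```python
# Hedged sketch of checkpoint selection by accuracy gap.
# Assumption (not given in the abstract): a checkpoint's "accuracy gap" is
# its training accuracy minus its held-out accuracy, and checkpoints with a
# SMALL gap are treated as carrying mostly task-intrinsic knowledge, which
# is what transfers across architectures.

def select_checkpoints(checkpoints, k):
    """Pick the k checkpoints with the smallest train/val accuracy gap.

    checkpoints: list of (name, train_acc, val_acc) tuples.
    Returns the selected checkpoint names, smallest gap first.
    """
    ranked = sorted(checkpoints, key=lambda c: c[1] - c[2])
    return [name for name, _, _ in ranked[:k]]

# Toy usage: three hypothetical checkpoints from one training run.
ckpts = [
    ("epoch10", 0.80, 0.72),  # gap 0.08
    ("epoch20", 0.90, 0.86),  # gap 0.04
    ("epoch30", 0.99, 0.88),  # gap 0.11 (likely task-irrelevant memorization)
]
print(select_checkpoints(ckpts, 2))  # ['epoch20', 'epoch10']
```

In an actual attack, the selected checkpoints would then play the role of the surrogate ensemble (e.g., averaging input gradients across them when crafting adversarial examples), replacing the multiple differently-architected models a traditional ensemble attack requires.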