Exposing Functional Fusion: A New Class of Strategic Backdoor in Dynamic Prompt Architectures
Abstract
Existing ViT backdoor attacks that rely on full fine-tuning overwrite the backbone, making them computationally expensive and prone to degrading model performance. This has pushed adversaries toward the Visual Parameter-Efficient Fine-Tuning (PEFT) paradigm, dominated by adapter-based (e.g., LoRA) and prompt-based (e.g., VPT) approaches. While adapter security has seen initial study, the risks of the burgeoning prompt-based ecosystem remain largely unexplored. We fill this gap, exposing how the evolution of VPT towards dynamic, context-aware architectures inherently creates a far more dangerous, emergent threat. This vulnerability arises even though these dynamic modules unlock superior benign performance. We propose VIPER, an attack framework built on a lightweight, dynamic Visual Prompt Generator (VPG) that demonstrates this vulnerability. Critically, this dynamic architecture enables Functional Fusion: an emergent phenomenon where malicious logic and benign task utility are inseparably fused into the same sparse, high-magnitude parameter core. This fusion creates an unsolvable ``hostage'' dilemma, as pruning the attack necessarily destroys the benign performance. Comprehensive evaluations show that VIPER resolves the attacker's trilemma: it achieves state-of-the-art performance on clean data, maintains near-100\% ASR even under 90\% VPG-module pruning (where LoRA attacks collapse), and adds only an imperceptible 0.06 ms (1.16\%) of inference latency. VIPER's results, driven by Functional Fusion, expose a new, paradigm-level risk in dynamic prompt architectures.