Beyond [CLS] Token: Query-Driven Token-Level Forgery Purification for Generalizable Deepfake Detection
Abstract
We revisit the feature learning process of state-of-the-art deepfake detectors built on ViT-based vision foundation models and discover that the [\texttt{CLS}] token, commonly adopted for detection, suffers from Pre-trained Information Bias (PIB), \textit{i.e.}, because its knowledge is dominated by pre-trained model parameters, it mainly attends to global semantics while struggling to emphasize subtle local forgery cues. One potential remedy for this limitation is to incorporate token-level features to form a new detection-specific token. To this end, we propose the Query-Driven Token-Level Forgery Purification (QTFP) framework, which enables the model to better capture local forgery traces without discarding useful pre-trained priors. Specifically, we first introduce randomly initialized, learnable query tokens that are independent of the backbone and its prior knowledge, and that effectively aggregate multi-patch evidence into a global token for detection. To make the query tokens focus on meaningful regions, we propose a theoretically grounded fake-likelihood contrastive learning loss, which employs a weighting strategy to highlight significant fake regions while diminishing the impact of real-like patches. Using signal-to-noise ratio (SNR) analysis, we verify that the designed weight is both reliable and informative. To further preserve useful authentic information, a real-attention alignment constraint is applied to the query tokens. These designs go beyond relying solely on the [\texttt{CLS}] token by jointly reorganizing real and fake information across all tokens, which substantially enhances detector robustness. Extensive experiments on diverse datasets demonstrate the effectiveness of our method.
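To make the aggregation idea concrete, the following is a minimal NumPy sketch of how randomly initialized query tokens could cross-attend over ViT patch tokens and be pooled into a single detection token. All shapes, names (`query_aggregate`, the number of queries), and the mean-pooling step are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def query_aggregate(patch_tokens, queries):
    """Cross-attention sketch: learnable queries attend over patch tokens,
    and the attended queries are averaged into one detection token.

    patch_tokens: (N, d) ViT patch features
    queries:      (Q, d) randomly initialized, backbone-independent queries
    returns:      (d,)   aggregated global token for detection
    """
    d = queries.shape[-1]
    attn = softmax(queries @ patch_tokens.T / np.sqrt(d), axis=-1)  # (Q, N)
    attended = attn @ patch_tokens                                   # (Q, d)
    return attended.mean(axis=0)                                     # (d,)

rng = np.random.default_rng(0)
patches = rng.standard_normal((196, 64))   # e.g., 14x14 patch grid, dim 64
queries = rng.standard_normal((4, 64))     # 4 hypothetical learnable queries
det_token = query_aggregate(patches, queries)
print(det_token.shape)  # (64,)
```

In training, `queries` would be learnable parameters updated by the fake-likelihood contrastive loss and the real-attention alignment constraint, so that the attention weights concentrate on forgery-relevant patches rather than global semantics.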