Poster
FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning
Gaojian Wang · Feng Lin · Tong Wu · Zhenguang Liu · Zhongjie Ba · Kui Ren
Abstract:
This work asks: with abundant, unlabeled real faces, how can we learn a robust and transferable facial representation that boosts various face security tasks in terms of generalization performance? We make the first attempt and propose FSFM, a self-supervised pretraining framework that learns fundamental representations of real face images by leveraging the synergy between masked image modeling (MIM) and instance discrimination (ID). We explore various facial masking strategies for MIM and present a simple yet powerful CRFR-P masking, which explicitly forces the model to capture meaningful intra-region Consistency and challenging inter-region Coherency. Furthermore, we devise an ID network that naturally couples with MIM to establish underlying local-to-global Correspondence via tailored self-distillation. These three learning objectives, namely 3C, empower encoding both local features and global semantics of real faces. After pretraining, a vanilla ViT serves as a universal vision Foundation Model for downstream Face Security tasks: cross-dataset deepfake detection, cross-domain face anti-spoofing, and unseen diffusion facial forgery detection. Extensive experiments on 10 public datasets demonstrate that our model transfers better than supervised pretraining and state-of-the-art visual and facial self-supervised learning methods, and even outperforms task-specialized SOTA methods.
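The CRFR-P idea is concrete enough to sketch. Below is a minimal NumPy illustration, assuming per-patch facial-region labels are available from an external face parser; the function and argument names (`crfr_p_mask`, `region_ids`, `mask_ratio`) are hypothetical and not the authors' API, and the proportional step is approximated by uniform sampling over the remaining visible patches.

```python
import numpy as np

def crfr_p_mask(region_ids, mask_ratio=0.75, rng=None):
    """Hedged sketch of CRFR-P masking: fully Cover one Random Facial
    Region, then mask patches from the remaining regions until the
    overall mask_ratio is reached.

    region_ids : (N,) int array mapping each image patch to a facial
                 region (e.g., eyes, nose, mouth, skin), assumed to
                 come from an off-the-shelf face parser.
    Returns a boolean (N,) array; True marks a masked patch.
    """
    rng = rng or np.random.default_rng()
    n = region_ids.shape[0]
    masked = np.zeros(n, dtype=bool)

    # 1) Cover a random facial region entirely -> intra-region consistency:
    #    the model must reconstruct a whole semantic part from context.
    target = rng.choice(np.unique(region_ids))
    masked[region_ids == target] = True

    # 2) Spend the remaining budget on the other regions -> inter-region
    #    coherency. Uniform sampling over visible patches masks each
    #    region in proportion to its size, in expectation.
    budget = int(round(mask_ratio * n)) - int(masked.sum())
    if budget > 0:
        visible = np.flatnonzero(~masked)
        masked[rng.choice(visible, size=budget, replace=False)] = True
    return masked

# Example: 196 patches (a 14x14 ViT grid) with toy region labels.
regions = np.repeat([0, 1, 2, 3], 49)
print(crfr_p_mask(regions, mask_ratio=0.75).mean())  # ~0.75
```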