Breaking Spurious Correlations: Uncertainty-Driven Causal Transformers for AU Detection
Abstract
Facial Action Unit (AU) detection suffers from limited annotated data, severe class imbalance, and label noise, which often lead to overfitting and degraded performance. We propose a framework that combines an uncertainty-aware Transformer with causal intervention to address these challenges. By modeling attention weights as Gaussian distributions, our probabilistic Transformer captures inter-AU dependencies together with epistemic uncertainty. An uncertainty-guided loss weighting strategy further mitigates data imbalance by adaptively emphasizing reliable predictions. In addition, a causal intervention module removes spurious correlations induced by confounders, so that the learned AU relationships reflect causal rather than merely statistical dependencies. Extensive experiments on BP4D and DISFA show that our framework achieves state-of-the-art performance with improved robustness and generalization.
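
As an illustration of the two uncertainty-related ideas above, the following minimal sketch samples one row of attention weights whose logits are Gaussian and then turns the resulting per-AU variance into loss weights. It is not the implementation used in this work: the placement of the Gaussian on the logits, the Monte Carlo variance as an uncertainty proxy, the exp(-variance) weighting form, and the names `sample_attention` and `uncertainty_weights` are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_attention(mu, log_var, n_samples=100, seed=0):
    """Monte Carlo estimate of attention weights with Gaussian logits.

    mu, log_var: per-AU mean and log-variance for one row of attention
    logits (hypothetical parameterization, assumed for this sketch).
    Returns (mean_weights, var_weights) computed across the samples.
    """
    rng = random.Random(seed)
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    samples = []
    for _ in range(n_samples):
        # Reparameterization: logit = mu + sigma * eps, with eps ~ N(0, 1)
        logits = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        samples.append(softmax(logits))
    k = len(mu)
    mean = [sum(s[j] for s in samples) / n_samples for j in range(k)]
    var = [sum((s[j] - mean[j]) ** 2 for s in samples) / n_samples
           for j in range(k)]
    return mean, var


def uncertainty_weights(variances):
    """Map per-AU uncertainty to loss weights that sum to 1.

    Assumed form: weight proportional to exp(-variance), so predictions
    with lower uncertainty receive a larger weight.
    """
    raw = [math.exp(-v) for v in variances]
    total = sum(raw)
    return [r / total for r in raw]
```

In this sketch, the variance of the sampled attention weights stands in for the uncertainty signal, and the normalized weights would multiply the per-AU loss terms during training.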