Dynamic Label Noise Suppression with Optimal Teacher Pool for Facial Expression Recognition
Yuzhuang Yang ⋅ Xiaolin Tian ⋅ Qigong Sun
Abstract
Due to the inherent ambiguity of facial expressions and the subjectivity of dataset labeling, learning with noisy labels remains a critical challenge in facial expression recognition (FER). The supervisory mechanism of teacher-student networks offers a promising approach to noisy-labeled FER. However, this approach is prone to noise accumulation and to gradual coupling between the teacher and student parameters during training. We propose an $\textbf{O}$ptimal $\textbf{T}$eacher $\textbf{P}$ool-driven dynamic label $\textbf{N}$oise $\textbf{S}$uppression framework for facial expression recognition (OTP-NS). Specifically, we construct an optimal teacher pool that dynamically maintains multiple best-performing teacher models and fuses their predictions, mitigating noise accumulation and teacher-student parameter coupling through its update mechanism. Furthermore, we develop two sample-level noise suppression components: (1) Similarity-Aware Label Smoothing (SALS), which, unlike the static smoothing strength of traditional label smoothing, automatically modulates the smoothing strength for the teacher model based on prediction-label similarity, achieving fine-grained noise suppression; and (2) Confidence-Weighted Logits (CWL), which adaptively reweights the classification loss of the student model based on sample-to-centroid confidence, alleviating the detrimental effect of noisy samples on training. Extensive experiments on multiple benchmark datasets demonstrate that our method outperforms state-of-the-art approaches across various noise levels, validating the effectiveness of the proposed framework in learning robust representations from noisy data.
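The abstract does not specify implementation details, so the following PyTorch-style sketch shows only one plausible reading of the three mechanisms: averaging the logits of the pooled teachers, scaling the label-smoothing strength by the similarity between the teacher's prediction and the one-hot label (SALS), and weighting each sample's cross-entropy loss by its feature-to-class-centroid confidence (CWL). All function names, the choice of cosine similarity, and the exact weighting scheme are assumptions for illustration, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def pool_predict(teacher_pool, images):
    """Hypothetical teacher-pool fusion: average the logits of the
    best-performing teacher checkpoints kept in the pool."""
    with torch.no_grad():
        logits = torch.stack([t(images) for t in teacher_pool], dim=0)
    return logits.mean(dim=0)

def sals_targets(teacher_logits, labels, num_classes, max_smooth=0.4):
    """Similarity-Aware Label Smoothing (hypothetical sketch).

    The smoothing strength shrinks when the teacher's prediction agrees
    with the given label; disagreement (a likely noisy label) is
    smoothed more strongly."""
    probs = F.softmax(teacher_logits, dim=1)                 # (B, C)
    onehot = F.one_hot(labels, num_classes).float()          # (B, C)
    sim = F.cosine_similarity(probs, onehot, dim=1)          # (B,)
    eps = max_smooth * (1.0 - sim).unsqueeze(1)              # per-sample strength
    return (1.0 - eps) * onehot + eps / num_classes          # smoothed targets

def cwl_loss(student_logits, labels, features, centroids):
    """Confidence-Weighted Logits loss (hypothetical sketch).

    Down-weights the cross-entropy of samples whose features lie far
    from their class centroid, used here as a proxy for label noise."""
    ce = F.cross_entropy(student_logits, labels, reduction="none")  # (B,)
    conf = F.cosine_similarity(features, centroids[labels], dim=1)  # (B,)
    return (conf.clamp(min=0.0) * ce).mean()
```

In this reading, SALS acts on the supervision signal produced by the teacher side, while CWL acts on the student's loss, so the two components suppress noise at complementary points of the teacher-student loop.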