PAF: Perturbation-Aware Filtering for Open-Set Semi-Supervised Learning
Abstract
Open-set semi-supervised learning (OSSL) has achieved notable progress in exploiting unlabeled data, yet most existing methods overlook the distinct sensitivities of in-distribution (ID) and out-of-distribution (OOD) samples to semantic-preserving perturbations, resulting in unreliable OOD sample filtering. We address this gap by leveraging the behavioral difference between ID and OOD samples under perturbations and extending it into a representation-level signal for reliable OOD filtering. Specifically, we propose a novel filtering strategy, Perturbation-Aware Filtering (PAF), which identifies OOD samples by measuring representation instability under semantic-preserving perturbations. We then integrate PAF into a carefully designed two-stage training framework, allowing the model to exploit abundant unlabeled data in the first stage and gradually adapt to the open-set setting with limited labels in the second stage. Extensive experiments on widely used OSSL benchmarks demonstrate that PAF outperforms state-of-the-art OSSL methods. The source code will be released publicly.
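To make the core filtering idea concrete, the following is a minimal sketch of perturbation-based OOD filtering, not the paper's actual PAF implementation: each unlabeled sample's representation is computed under several semantic-preserving perturbations, an instability score (here, mean pairwise cosine distance across the perturbed views) is measured, and samples with high instability are filtered as OOD. The function names, the cosine-distance score, and the threshold value are all illustrative assumptions.

```python
import numpy as np

def representation_instability(views):
    """Instability of one sample: mean pairwise cosine distance among
    L2-normalized representations of the sample under different
    semantic-preserving perturbations (illustrative score, not PAF's exact form)."""
    v = np.asarray(views, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)  # unit-normalize each view
    sims = v @ v.T                                    # pairwise cosine similarities
    iu = np.triu_indices(len(v), k=1)                 # upper triangle: distinct pairs
    return float(np.mean(1.0 - sims[iu]))             # mean cosine distance

def perturbation_filter(batch_views, threshold=0.2):
    """Keep samples whose representations stay stable under perturbation
    (treated as ID); flag unstable ones as OOD. Threshold is hypothetical."""
    scores = [representation_instability(vs) for vs in batch_views]
    keep_as_id = [s < threshold for s in scores]
    return scores, keep_as_id

# Toy example: an ID-like sample whose views barely move, and an
# OOD-like sample whose views scatter in representation space.
stable_views = [[1.0, 0.0], [0.99, 0.01], [1.0, 0.02]]
unstable_views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores, keep = perturbation_filter([stable_views, unstable_views])
```

The intuition captured here matches the abstract's claim: ID samples yield nearly identical representations across perturbations (low score, kept), while OOD samples drift (high score, filtered out).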