GaussianMatch: Semi-Supervised Regression with Pseudo-Label Filtering via Multi-View Gaussian Consistency
Abstract
Semi-Supervised Regression (SSR) is essential in domains such as sentiment analysis and healthcare, where labeled data is limited but unlabeled data is plentiful. Despite its practical importance, SSR remains underexplored because effective pseudo-labeling strategies for continuous outputs are lacking. Unlike classification, regression has no inherent confidence measure, which makes pseudo-labels harder to filter and trust. As a result, low-quality pseudo-labels propagate through training unchecked and significantly amplify prediction errors. In this work, we propose GaussianMatch, a novel SSR framework for high-quality pseudo-label filtering that selects reliable pseudo-labels through multi-view prediction consistency under a feature-space smoothness assumption. Our framework introduces two key innovations: 1) a Gaussian Consistency Filter (GCF) that quantifies prediction consistency across weakly augmented views through Gaussian similarity scoring, retaining a pseudo-label only when all predictions fall within a confidence interval; and 2) Adaptive Gaussian Standard Deviation Smoothing (AGDS), which makes GCF more robust through a Bayesian-regularized curriculum that moves the confidence interval from conservative warm-up bounds to progressively tighter thresholds, keeping pseudo-label filtering stable and reliable throughout training. Extensive experiments demonstrate that GaussianMatch performs strongly across varying data conditions and is notably robust under extreme label scarcity. For instance, it outperforms the state of the art on UTKFace with only 30 labels, reducing error by 15.36\% and improving the Coefficient of Determination by 50.21\%.
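The filtering idea described above can be illustrated with a minimal sketch. The abstract does not specify the scoring formula, the threshold, or the schedule, so everything below is an assumption made for illustration: the function names, the use of the mean over views as the pseudo-label, the similarity threshold `tau`, and the default values. In particular, the linear decay in `sigma_schedule` is only a simple stand-in and is not the Bayesian-regularized curriculum that AGDS uses.

```python
import numpy as np


def gaussian_consistency_filter(view_preds, sigma, tau=0.6):
    """Hypothetical GCF-style filter (illustrative, not the paper's exact rule).

    view_preds: array of shape (n_samples, n_views) holding the model's
                predictions on weakly augmented views of each unlabeled sample.
    sigma:      Gaussian standard deviation that sets the interval width.
    tau:        assumed similarity threshold in (0, 1).
    """
    view_preds = np.asarray(view_preds, dtype=float)
    # Assumption: the pseudo-label is the mean prediction over the views.
    pseudo = view_preds.mean(axis=1)
    # Gaussian similarity of every view prediction to the pseudo-label.
    scores = np.exp(-((view_preds - pseudo[:, None]) ** 2) / (2.0 * sigma ** 2))
    # Keep a sample only if ALL views are similar enough, i.e. every prediction
    # lies within pseudo +/- sigma * sqrt(-2 * ln(tau)).
    keep = (scores >= tau).all(axis=1)
    return pseudo, keep


def sigma_schedule(step, total_steps, sigma_start=2.0, sigma_end=0.5, warmup=0.2):
    """Stand-in schedule: wide interval during warm-up, then linear tightening."""
    frac = step / max(total_steps, 1)
    if frac <= warmup:
        return sigma_start
    t = (frac - warmup) / (1.0 - warmup)
    return sigma_start + t * (sigma_end - sigma_start)


if __name__ == "__main__":
    preds = [[10.0, 10.2, 9.8],   # consistent views
             [10.0, 14.0, 6.0]]   # inconsistent views
    pseudo, keep = gaussian_consistency_filter(preds, sigma=sigma_schedule(80, 100))
    print(pseudo, keep)
```

Under these assumptions, the first sample is retained because all three view predictions lie close to their mean, while the second is discarded because two views deviate strongly from it. Tightening `sigma` over training makes the same filter progressively stricter.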