Rank-Guided Pseudo-Bias Learning for Robust Black-Box Adaptation
Abstract
Pretrained vision encoders are widely used as frozen, black-box feature extractors, yet they often inherit spurious correlations that disproportionately harm underrepresented groups. We introduce \textbf{PLD-Debias}, a fully black-box debiasing framework that requires neither access to backbone parameters nor demographic annotations. Our method integrates three components: (1) \emph{Rank-Regularized Amplification}, a lightweight adapter that exaggerates latent spurious directions; (2) \emph{Unsupervised Pseudo-Bias Induction}, which clusters the amplified features to infer high-fidelity proxy bias labels; and (3) \emph{Bias-Guided Refinement}, combining supervised contrastive alignment with cluster-aware adaptive margins to purify representations and equalize decision boundaries. We theoretically show that these components jointly tighten a worst-group risk bound under spurious correlations. Empirically, PLD-Debias achieves state-of-the-art worst-group accuracy across CelebA, Waterbirds, and CMNIST, improving worst-group accuracy by 3--5 points over prior black-box methods while maintaining average accuracy. Remarkably, our pseudo-bias labels align with ground-truth bias annotations at over 90\% fidelity, enabling oracle-level robustness without demographic supervision. Our results demonstrate that fairness and utility can be achieved through a plug-and-play classifier adapter for any frozen foundation model.
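To make the first two stages of the pipeline concrete, the following is a minimal numpy sketch, not the paper's implementation. It assumes a simple interpretation in which amplification rescales the top singular directions of the frozen features (the paper's rank regularizer is not reproduced here), and pseudo-bias induction is plain k-means on the amplified features; the function names `amplify_features` and `pseudo_bias_labels` are illustrative, not from the paper.

```python
import numpy as np

def amplify_features(X, rank=2, gamma=4.0):
    """Sketch of rank-guided amplification (assumed form): exaggerate the
    top-`rank` singular directions of the centered features by `gamma`,
    so dominant (potentially spurious) variation dominates clustering."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scale = np.ones_like(S)
    scale[:rank] = gamma  # boost only the leading directions
    return (U * (S * scale)) @ Vt

def pseudo_bias_labels(X_amp, k=2, iters=50, seed=0):
    """Unsupervised pseudo-bias induction: Lloyd's k-means on the
    amplified features; cluster ids serve as proxy bias labels."""
    rng = np.random.default_rng(seed)
    centers = X_amp[rng.choice(len(X_amp), size=k, replace=False)]
    for _ in range(iters):
        dists = ((X_amp[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster empties out.
        centers = np.stack([
            X_amp[labels == j].mean(axis=0) if (labels == j).any() else centers[j]
            for j in range(k)
        ])
    return labels
```

On synthetic features whose leading direction encodes a two-group split, the cluster assignments recover the group membership up to label permutation, mirroring the pseudo-bias fidelity the abstract reports. The third stage (contrastive refinement with adaptive margins) would then consume these labels.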