GS-ASM: 2DGS-Supervised Active Stereo Matching
Abstract
Due to the lack of ground truth, existing active stereo matching methods generally rely on fully self-supervised learning to estimate depth. Although they achieve promising results, a noticeable performance gap remains between them and supervised models. To close this gap, we propose a novel framework that synthesizes proxy labels, enabling supervised training of deep active stereo networks without any ground-truth depth. To expand the training data and generate disparity proxy labels, we develop an active 2D Gaussian Splatting (2DGS)-based synthesis method that explicitly models both the scene geometry and the projected active pattern. Furthermore, to balance the varying contributions of the different supervision signals during training, we design a hybrid supervision regularization strategy that dynamically adjusts the loss weights for stable optimization. We also contribute a real-world dataset captured with a handheld RealSense camera, along with our active 2DGS model, to facilitate future research on active stereo matching. Extensive qualitative and quantitative experiments demonstrate that our method achieves state-of-the-art performance on the active stereo matching task. The code and dataset will be publicly released.
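The abstract does not spell out how disparity proxy labels are obtained from the 2DGS model. As a minimal sketch, assuming the labels are derived from a depth map rendered for a rectified stereo pair, the standard relation d = f * B / Z converts depth to disparity. The function name, the `focal_px` and `baseline_m` values, and the toy depth map are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def depth_to_disparity(depth, focal_px, baseline_m, min_depth=1e-6):
    """Convert a depth map (meters) to disparity (pixels) for a rectified pair.

    Hypothetical helper: uses d = focal_px * baseline_m / depth and marks
    pixels with non-positive or near-zero depth as invalid.
    """
    depth = np.asarray(depth, dtype=np.float64)
    valid = depth > min_depth              # pixels with usable rendered depth
    disparity = np.zeros_like(depth)       # invalid pixels keep disparity 0
    disparity[valid] = focal_px * baseline_m / depth[valid]
    return disparity, valid

# Toy 2x2 depth map; the 0.0 entry stands for a pixel with no rendered depth.
depth = np.array([[0.5, 1.0],
                  [2.0, 0.0]])
disparity, valid = depth_to_disparity(depth, focal_px=640.0, baseline_m=0.05)
```

The validity mask can be used to exclude pixels without a proxy label from a supervised loss, so that missing depth does not contribute spurious zero-disparity targets.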