DualSplat: Robust 3D Gaussian Splatting via Pseudo-Mask Bootstrapping from Reconstruction Failures
Abstract
3D Gaussian Splatting achieves real-time photo-realistic rendering but struggles when training images contain transient objects that violate multi-view consistency. Existing methods face a fundamental dilemma: accurate transient detection requires a well-reconstructed static scene, yet clean reconstruction depends on reliable transient masks. This circular dependency causes persistent artifacts when both components are jointly optimized from a poor initialization. We present DualSplat, a two-stage framework that sidesteps this dilemma by first generating pseudo masks from reconstruction failures, then using them to guide clean scene optimization. We observe that transient objects manifest as incomplete fragments during initial training, since they appear in only a subset of views. We consolidate these failures into pseudo masks via instance-level thresholding and a feature-residual filter guided by SAM2 boundaries. The second stage then trains a clean 3DGS model under pseudo-mask supervision, with a lightweight MLP refining the masks online by progressively shifting from pseudo-priors to self-consistency as densification proceeds. Experiments on the RobustNeRF and NeRF On-the-Go benchmarks demonstrate that DualSplat achieves competitive performance with recent methods, with particularly strong results on scenes with high transient density.
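The progressive shift from pseudo-mask priors to self-consistency described above can be sketched as a simple blending schedule. This is a hypothetical illustration, not the paper's implementation: the function names (`mask_supervision_weight`, `blended_mask`) and the cosine decay schedule are assumptions, since the abstract does not specify how the shift is parameterized.

```python
import math

def mask_supervision_weight(step: int, total_steps: int) -> float:
    """Hypothetical schedule: the weight on the pseudo-mask prior decays
    from 1.0 to 0.0 as densification proceeds, so supervision gradually
    shifts toward the MLP's own self-consistency signal. Cosine decay is
    an assumption; the paper does not state the actual schedule."""
    t = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * (1.0 + math.cos(math.pi * t))

def blended_mask(pseudo_mask: float, mlp_mask: float,
                 step: int, total_steps: int) -> float:
    """Blend per-pixel mask values: early in training the pseudo mask
    dominates; later, the MLP's online refinement takes over."""
    w = mask_supervision_weight(step, total_steps)
    return w * pseudo_mask + (1.0 - w) * mlp_mask
```

Under this sketch, the first stage's pseudo masks act as a warm-start prior that is smoothly annealed away once the densifying Gaussians provide enough signal for the MLP to refine masks on its own.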