Poster

Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting

Runsong Zhu · Shi Qiu · Zhengzhe Liu · Ka-Hei Hui · Qianyi Wu · Pheng-Ann Heng · Chi-Wing Fu


Abstract:

Lifting multi-view 2D instance segmentation to a radiance field has proven effective in enhancing 3D understanding. Existing methods for addressing multi-view inconsistency in 2D segmentations either use linear assignment for end-to-end lifting, yielding inferior results, or adopt a two-stage solution, which is limited by complex pre- or post-processing. In this work, we design a new object-aware lifting approach that realizes an end-to-end lifting pipeline based on the 3D Gaussian representation, such that we can jointly learn Gaussian-level point features and a global object-level codebook across multiple views. First, we augment each Gaussian point with an additional Gaussian-level feature learned using a contrastive loss to encode instance information. Importantly, we introduce a learnable object-level codebook to account for individual objects in the scene, enabling an explicit object-level understanding, and associate the encoded object-level features with the Gaussian-level point features for segmentation predictions. Further, we formulate an association learning module and a noisy label filtering module for effective and robust codebook learning. We conduct experiments on three benchmarks: the LERF-Masked, Replica, and Messy Rooms datasets. Both qualitative and quantitative results demonstrate that our approach clearly outperforms existing methods in terms of segmentation quality and time efficiency.
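To make the codebook association concrete, here is a minimal sketch of how per-Gaussian features might be matched against a learnable object-level codebook to produce segmentation predictions. This is an illustration under assumed conventions (dot-product similarity followed by a softmax over codebook entries), not the paper's exact formulation; the function and variable names are hypothetical.

```python
import numpy as np

def segment_gaussians(point_feats, codebook):
    """Assign each Gaussian to a codebook entry (hypothetical sketch).

    point_feats: (N, D) array of Gaussian-level point features.
    codebook:    (K, D) array of object-level codes, one per object.
    Returns an (N,) array of object indices.
    """
    # Association scores: similarity between each point feature
    # and each object-level code.
    logits = point_feats @ codebook.T                      # (N, K)
    # Softmax over the K codebook entries gives per-Gaussian
    # object probabilities.
    shifted = logits - logits.max(axis=1, keepdims=True)   # numeric stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # Hard assignment: the most likely object per Gaussian.
    return probs.argmax(axis=1)

# Toy example: 2 objects, 4 Gaussians whose features cluster
# around the two codes.
codebook = np.array([[1.0, 0.0],
                     [0.0, 1.0]])
feats = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.1, 0.9],
                  [0.2, 0.8]])
labels = segment_gaussians(feats, codebook)
print(labels)  # [0 0 1 1]
```

In training, both the point features and the codebook would be optimized jointly, with the contrastive loss shaping the Gaussian-level features and the association scores supervised by the (noise-filtered) 2D segmentation labels.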
