CVPR Poster PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding

Poster

PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding

Hongjia Zhai · Hai Li · Zhenzhe Li · Xiaokun Pan · Yijia He · Guofeng Zhang

ExHall D Poster #332

[ Abstract ] [ Project Page ]

Sat 14 Jun 8:30 a.m. PDT — 10:30 a.m. PDT

Abstract:

Recently, 3D Gaussian Splatting (3DGS) has shown encouraging performance for open vocabulary scene understanding tasks. However, previous methods can not distinguish 3D instance-level information, which usually predicts a heatmap between the scene feature and text query. In this paper, we propose PanoGS, a novel and efficient 3D panoptic open vocabulary scene understanding approach. Technical-wise, to learn accurate 3D language features that can scale to large indoor scenarios, we adopt the pyramid tri-planes to model the latent continuous parametric feature space and use a 3D feature decoder to regress the multi-view fused 2D feature cloud. Besides, we propose language-guided graph cuts that synergistically leverage reconstructed geometry and learned language cues to group 3D Gaussian primitives into a set of super-primitives. To obtain 3D consistent instance, we perform graph clustering based segmentation with SAM-guided edge affinity computation between different super-primitives. Extensive experiments on widely used datasets show better or more competitive performance on 3D panoptic open vocabulary scene understanding.

Live content is unavailable. Log in and register to view live content