Poster
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
Zhaochong An · Guolei Sun · Yun Liu · Runjia Li · Junlin Han · Ender Konukoglu · Serge Belongie
Generalized few-shot 3D point cloud segmentation (GFS-PCS) enables a model to adapt to new classes with a few support samples while retaining segmentation of base classes. Existing GFS-PCS approaches focus on enhancing prototypes by interacting with support or query features, but they remain limited by the sparse knowledge available in few-shot samples. Meanwhile, 3D vision-language models (3D VLMs), designed to generalize to open-world novel classes by aligning with language models, contain rich but noisy novel class knowledge. In this work, we introduce GFS-VL, a GFS-PCS framework that synergizes the dense but noisy pseudo-labels from 3D VLMs with the precise yet sparse few-shot samples, exploiting the strengths of both. Specifically, we present a prototype-guided pseudo-label selection that filters low-quality regions, followed by an adaptive infilling strategy that combines knowledge from pseudo-label contexts and few-shot samples to label the filtered, unlabeled areas. Additionally, to further exploit few-shot samples, we design a novel-base mix strategy that embeds few-shot samples into training scenes, preserving essential context for improved novel class learning. Moreover, recognizing the limited diversity of current GFS-PCS benchmarks, we introduce two challenging benchmarks with diverse novel classes for comprehensive generalization evaluation. Experiments validate the effectiveness of our framework across models and datasets. Our approach and benchmarks provide a solid foundation for advancing GFS-PCS in real-world applications. The code will be released.
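The prototype-guided selection step described above can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' released implementation: the function name `prototype_filter`, the cosine-similarity criterion, the threshold value, and the use of -1 as the "unlabeled, to be infilled" marker are all assumptions made here for illustration. The idea shown is only that few-shot support features define a per-class prototype, and pseudo-labeled points whose features lie far from their class prototype are filtered out before adaptive infilling.

```python
import numpy as np

def prototype_filter(point_feats, pseudo_labels, support_feats,
                     support_labels, thresh=0.7):
    """Hypothetical sketch of prototype-guided pseudo-label selection.

    Keeps pseudo-labeled points whose features are close (cosine
    similarity >= thresh) to the few-shot prototype of their assigned
    class; marks the rest as unlabeled (-1) for later infilling.
    """
    out = pseudo_labels.copy()
    for c in np.unique(support_labels):
        mask = pseudo_labels == c
        if not mask.any():
            continue
        # Class prototype: mean of support features, L2-normalized
        proto = support_feats[support_labels == c].mean(axis=0)
        proto = proto / (np.linalg.norm(proto) + 1e-8)
        feats = point_feats[mask]
        feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
        sim = feats @ proto  # cosine similarity to the prototype
        idx = np.where(mask)[0]
        out[idx[sim < thresh]] = -1  # filtered; infilled adaptively later
    return out
```

In the full framework, the filtered (-1) regions would then be re-labeled by the adaptive infilling strategy using pseudo-label context and few-shot knowledge; that step is omitted here.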