

Poster

TopNet: Transformer-Efficient Occupancy Prediction Network for Octree-Structured Point Cloud Geometry Compression

Xinjie Wang · Yifan Zhang · Ting Liu · Xinpu Liu · Ke Xu · Jianwei Wan · Yulan Guo · Hanyun Wang


Abstract:

Efficient Point Cloud Geometry Compression (PCGC) with fewer bits per point (BPP) and a higher peak signal-to-noise ratio (PSNR) is essential for the transmission of large-scale 3D data. Although octree-based entropy models can reduce BPP without introducing geometry distortion, existing CNN-based models are constrained by limited receptive fields and struggle to capture long-range dependencies, while Transformer-based architectures tend to neglect fine-grained details due to their reliance on global self-attention. In this paper, we propose a Transformer-efficient occupancy prediction network, termed TopNet, to overcome these challenges through several novel components: Locally-enhanced Context Encoding (LeCE), which strengthens the translation invariance of octree nodes; Adaptive-Length Sliding Window Attention (AL-SWA), which captures both global and local dependencies while adaptively adjusting attention weights based on the input window length; a Spatial-Gated-enhanced Channel Mixer (SG-CM) for efficient feature aggregation from ancestor and sibling nodes; and a Latent-guided Node Occupancy Predictor (LNOP), which improves prediction accuracy for spatially adjacent octree nodes. Comprehensive experiments on both indoor and outdoor point cloud datasets demonstrate that TopNet achieves state-of-the-art performance with fewer parameters, further pushing the compression-efficiency frontier of PCGC.
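Since the full paper is not reproduced here, the exact AL-SWA formulation is unavailable; the following is a minimal PyTorch sketch of sliding-window self-attention over a breadth-first sequence of octree node features, with a learned per-window-length temperature standing in for the length-adaptive weighting the abstract describes. The class name SlidingWindowAttention, the window size, and the temperature mechanism are illustrative assumptions, not TopNet's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SlidingWindowAttention(nn.Module):
    """Single-head sliding-window self-attention over octree node features.

    Nodes attend only within fixed-size local windows; a learned
    temperature indexed by the number of valid nodes in each window is a
    hypothetical stand-in for AL-SWA's length-adaptive weighting.
    """

    def __init__(self, dim: int, window: int = 8):
        super().__init__()
        self.window = window
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # One learnable log-temperature per possible window length 1..window.
        self.log_tau = nn.Parameter(torch.zeros(window))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, dim) features of N octree nodes in breadth-first order.
        n, d = x.shape
        pad = (-n) % self.window  # pad so N splits evenly into windows
        x = F.pad(x, (0, 0, 0, pad))
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (num_windows, window, dim).
        q = q.view(-1, self.window, d)
        k = k.view(-1, self.window, d)
        v = v.view(-1, self.window, d)
        attn = (q @ k.transpose(-2, -1)) / d**0.5
        if pad:  # mask padded key positions in the final window
            key_pad = torch.zeros(n + pad, dtype=torch.bool, device=x.device)
            key_pad[n:] = True
            key_pad = key_pad.view(-1, self.window)
            attn = attn.masked_fill(key_pad[:, None, :], float("-inf"))
        # Length-adaptive scaling: temperature chosen by valid window length.
        valid = torch.full((q.shape[0],), self.window,
                           dtype=torch.long, device=x.device)
        if pad:
            valid[-1] = self.window - pad
        tau = self.log_tau[valid - 1].exp()  # (num_windows,)
        attn = attn * tau[:, None, None]
        out = attn.softmax(dim=-1) @ v  # (num_windows, window, dim)
        return self.proj(out.reshape(-1, d)[:n])


# Usage: 100 octree nodes at one level, 64-dim features.
layer = SlidingWindowAttention(dim=64)
feats = torch.randn(100, 64)
print(layer(feats).shape)  # torch.Size([100, 64])
```

Restricting attention to local windows keeps cost linear in the number of nodes rather than quadratic, which is presumably why a windowed scheme suits large octrees better than global self-attention while still preserving the fine-grained local context the abstract emphasizes.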
