GauMVC: Generative Decoupled Gaussian Representation for Human-centric Multi-view Video Compression
Abstract
Human-centric multi-view video has a clear semantic structure: a static background and dynamic human motion. We propose a generative compression framework that explicitly decouples these two components. The background is modeled once with 3D Gaussian Splatting, while the human is represented by a personalized Gaussian avatar, reconstructed from a sparse set of key views transmitted only once and driven by compact per-frame pose parameters from the Skinned Multi-Person Linear (SMPL) model. The encoder therefore sends only three elements: the background model, the key views, and the per-frame SMPL parameters, enabling high-fidelity multi-viewpoint synthesis at dramatically reduced bitrates. This shifts compression from low-level redundancy removal to semantics-aware generative modeling. Experiments on multiple human-centric datasets demonstrate superior rate–distortion performance, particularly for long and densely captured sequences, and the decoupled representation naturally enables semantic editing.
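To make the bitrate argument concrete, the decoupled bitstream can be sketched as two payload types: a one-shot payload (background model and avatar key views) and a lightweight per-frame payload carrying only SMPL motion parameters. The following is a minimal illustrative sketch, not the paper's actual bitstream format; the class names, field sizes, and float16 quantization are assumptions for illustration.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class OneShotPayload:
    """Illustrative: transmitted once per sequence."""
    background_gaussians: np.ndarray  # static 3D Gaussian parameters
    key_views: list                   # sparse images for avatar reconstruction

@dataclass
class PerFramePayload:
    """Illustrative: transmitted every frame (motion only)."""
    pose: np.ndarray         # SMPL body pose: 24 joints x 3 axis-angle = 72 values
    translation: np.ndarray  # global root translation: 3 values

def per_frame_bits(payload: PerFramePayload, bytes_per_value: int = 2) -> int:
    """Bits per frame if parameters are quantized to float16 (assumed)."""
    n_values = payload.pose.size + payload.translation.size
    return n_values * bytes_per_value * 8

frame = PerFramePayload(pose=np.zeros(72), translation=np.zeros(3))
print(per_frame_bits(frame))  # 75 values x 2 bytes x 8 = 1200 bits/frame
```

Under these assumed sizes, the recurring per-frame cost is on the order of a kilobit, which is why shifting pixel content into the one-shot payload can reduce bitrate dramatically for long sequences.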