MotionScale: Reconstructing Appearance, Geometry, and Motion of Dynamic Scenes with Scalable 4D Gaussian Splatting
Abstract
Realistic reconstruction of dynamic 4D scenes is essential for understanding the physical world. Despite recent progress in monocular view synthesis, existing methods still struggle to recover accurate 3D geometry and temporally consistent motion in complex environments. To address these challenges, we propose MotionScale, a 4D Gaussian Splatting framework that scales efficiently to large scenes and extended sequences, enabling faithful reconstruction of high-fidelity scene structures and coherent motion representation under complex dynamics. To handle motion with arbitrary flexibility and long-term variation, we introduce a scalable motion field built upon cluster-based bases that adaptively grow to capture diverse motion patterns over time. Moreover, we introduce a progressive optimization strategy that extends naturally to unseen frames. This strategy comprises two propagation modules: 1) a background module that adapts to newly appearing objects, refines camera poses, and accounts for shadows; 2) a foreground module that refines motion through a three-stage process. Extensive experiments on challenging real-world datasets demonstrate that MotionScale achieves superior reconstruction quality and motion consistency, significantly outperforming prior 4D scene reconstruction methods. Our code will be open-sourced upon paper acceptance.
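The abstract describes a motion field built from cluster-based bases that grow adaptively over time. The following is a minimal illustrative sketch of that idea, not the authors' implementation: all function names, array shapes, the Fourier temporal basis, and the residual-triggered growth rule are assumptions made for illustration. Each Gaussian's displacement at time t is modeled as a soft-weighted sum of per-cluster basis trajectories, and new clusters can be appended when existing bases no longer explain the motion.

```python
import numpy as np

def motion_field(weights, bases, t):
    """Hypothetical cluster-based motion field (one spatial axis for brevity).

    weights: (N, K) soft cluster assignments per Gaussian
    bases:   (K, B) per-cluster coefficients over B temporal basis functions
    t:       scalar time in [0, 1]
    Returns an (N,) array of per-Gaussian displacements at time t.
    """
    # Assumed Fourier-style temporal basis: low-frequency sin/cos terms.
    B = bases.shape[1]
    freqs = np.arange(1, B // 2 + 1)
    phi = np.concatenate([np.sin(2 * np.pi * freqs * t),
                          np.cos(2 * np.pi * freqs * t)])  # (B,)
    cluster_motion = bases @ phi       # (K,) motion of each cluster at t
    return weights @ cluster_motion    # (N,) per-Gaussian displacement

def grow_bases(bases, weights, n_new=1):
    """Append n_new zero-initialized clusters so the basis set can scale
    to newly observed motion patterns (assumed growth mechanism)."""
    K, B = bases.shape
    new_bases = np.vstack([bases, np.zeros((n_new, B))])
    new_weights = np.hstack([weights, np.zeros((weights.shape[0], n_new))])
    return new_bases, new_weights
```

In such a design, long sequences do not require a fixed motion capacity chosen up front: the basis set expands only when existing clusters fail to capture new motion, which is one plausible reading of "adaptively grow" in the abstract.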