Poster
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes
Jinxiu Liu · Shaoheng Lin · Yinxiao Li · Ming-Hsuan Yang
Generating 360° panoramic and dynamic world content is important for spatial intelligence. The increasing demand for immersive AR/VR applications has heightened the need for high-quality scene-level dynamic content. However, most video diffusion models are constrained to limited resolutions and aspect ratios, which restricts their applicability to scene-level dynamic content synthesis. In this work, we propose DynamicScaler, which addresses these challenges by enabling spatially scalable and panoramic dynamic scene synthesis that preserves coherence across panoramic scenes of arbitrary size. Specifically, we introduce an Offset Shifting Denoiser that performs efficient, synchronous, and coherent denoising of panoramic dynamic scenes with a fixed-resolution diffusion model through a seamlessly rotating window, ensuring smooth boundary transitions and consistency across the entire panoramic space while accommodating varying resolutions and aspect ratios. Additionally, we employ a Global Motion Guidance mechanism to ensure both local detail fidelity and global motion continuity. Extensive experiments demonstrate that our method achieves superior content and motion quality in panoramic scene-level video generation, offering a training-free, efficient, and scalable solution for immersive dynamic scene creation that requires only 11 GB of VRAM regardless of the output video resolution.
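The offset-shifting idea can be illustrated with a short sketch. The code below is an illustrative approximation, not the authors' released implementation: `denoise_window` stands in for any fixed-resolution video diffusion denoising step, and the panorama width is assumed to be a multiple of the window width. Rolling the panoramic latent by a step-dependent horizontal offset (with wrap-around) before windowed denoising moves the window seams to different longitudes at each step, which is the mechanism the abstract describes for seamless 360° boundaries at bounded memory cost.

```python
import torch

def offset_shifted_denoise_step(latent, denoise_window, window_w, step):
    """One denoising step over a panoramic video latent.

    latent: tensor of shape (B, C, F, H, W_pano); W_pano % window_w == 0.
    denoise_window: hypothetical fixed-resolution denoiser mapping a
        (B, C, F, H, window_w) latent to a latent of the same shape.
    """
    B, C, F, H, W = latent.shape
    # Shift the panorama by a step-dependent offset; torch.roll wraps
    # around horizontally, so the 360° boundary stays seamless and the
    # window seams land at different longitudes each step.
    offset = (step * window_w // 2) % W
    rolled = torch.roll(latent, shifts=-offset, dims=-1)
    out = torch.empty_like(rolled)
    # Denoise fixed-resolution windows one at a time; peak VRAM is set
    # by the window size, not by the panorama width.
    for x0 in range(0, W, window_w):
        out[..., x0:x0 + window_w] = denoise_window(rolled[..., x0:x0 + window_w])
    # Undo the shift to return to panorama coordinates.
    return torch.roll(out, shifts=offset, dims=-1)
```

Iterating this step over the full denoising schedule lets every latent location be denoised under many different window placements, which is why constant memory suffices for arbitrary output resolutions in this sketch.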