

Poster

GenAssets: Generating in-the-wild 3D Assets in Latent Space

Ze Yang · Jingkang Wang · Haowei Zhang · Sivabalan Manivasagam · Yun Chen · Raquel Urtasun


Abstract:

High-quality 3D assets for traffic participants such as vehicles and motorcycles are critical for multi-sensor simulation, which is required for the safe end-to-end development of autonomy. Building assets from in-the-wild real-world data is key for diversity and realism, but existing neural-rendering-based reconstruction methods are slow and generate assets that can only be rendered close to the original viewpoints of the observed actors, restricting their usage in simulation. Recent diffusion-based generative models build complete and diverse assets, but perform poorly on in-the-wild driving scenes, where observed actors are captured under sparse and limited fields of view and are partially occluded. In this work, we propose a 3D latent diffusion model that learns from in-the-wild LiDAR and camera data captured by a sensor platform and generates high-quality 3D assets with complete geometry and appearance. Key to our method is a "reconstruct-then-generate" approach that first leverages occlusion-aware neural rendering trained over multiple scenes to build a high-quality latent space for objects, and then trains a generative diffusion model that operates on that latent space. We show our method outperforms existing reconstruction-based and generative methods, unlocking diverse and scalable content creation for simulation.
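The abstract describes a two-stage "reconstruct-then-generate" pipeline: first fit a per-object latent space with occlusion-aware reconstruction, then train a diffusion model over those latents. The sketch below is a minimal, hypothetical illustration of that structure only; `ObjectEncoder`, `OcclusionAwareRenderer`, `LatentDenoiser`, and the batch keys are stand-ins assumed for illustration and are not the authors' actual architecture or code.

```python
# Minimal sketch of a two-stage "reconstruct-then-generate" pipeline.
# All module definitions and batch keys are assumptions for illustration,
# not the GenAssets implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ObjectEncoder(nn.Module):
    """Hypothetical encoder: maps per-object sensor features (camera + LiDAR) to a latent code."""
    def __init__(self, feat_dim=256, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))

    def forward(self, obs_feats):              # (B, feat_dim)
        return self.net(obs_feats)             # (B, latent_dim)


class OcclusionAwareRenderer(nn.Module):
    """Hypothetical decoder: predicts RGB + depth per ray from a latent code."""
    def __init__(self, latent_dim=64, ray_dim=6):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + ray_dim, 256), nn.ReLU(), nn.Linear(256, 4))

    def forward(self, latent, rays):           # latent: (B, D), rays: (B, N, 6)
        z = latent.unsqueeze(1).expand(-1, rays.shape[1], -1)
        return self.net(torch.cat([z, rays], dim=-1))   # (B, N, 4)


class LatentDenoiser(nn.Module):
    """Hypothetical diffusion denoiser over object latents (epsilon prediction)."""
    def __init__(self, latent_dim=64, num_steps=1000):
        super().__init__()
        self.num_steps = num_steps
        self.net = nn.Sequential(nn.Linear(latent_dim + 1, 256), nn.ReLU(), nn.Linear(256, latent_dim))

    def forward(self, zt, t):
        t_emb = (t.float() / self.num_steps).unsqueeze(-1)
        return self.net(torch.cat([zt, t_emb], dim=-1))


def stage1_reconstruction_step(encoder, renderer, batch, optimizer):
    """Stage 1: fit the latent space; occluded rays are masked out of the loss."""
    latent = encoder(batch["obs_feats"])
    pred = renderer(latent, batch["rays"])
    vis = batch["visibility"].unsqueeze(-1)    # 1 = visible, 0 = occluded
    loss = (F.mse_loss(pred, batch["targets"], reduction="none") * vis).mean()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()


def stage2_diffusion_step(denoiser, encoder, batch, optimizer, num_steps=1000):
    """Stage 2: train a diffusion model on the frozen latent space (simplified DDPM objective)."""
    with torch.no_grad():
        z0 = encoder(batch["obs_feats"])
    t = torch.randint(0, num_steps, (z0.shape[0],), device=z0.device)
    alpha_bar = torch.cos(0.5 * torch.pi * t.float() / num_steps).pow(2).unsqueeze(-1)
    noise = torch.randn_like(z0)
    zt = alpha_bar.sqrt() * z0 + (1 - alpha_bar).sqrt() * noise
    loss = F.mse_loss(denoiser(zt, t), noise)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```

At inference time, one would sample a latent from the trained diffusion model and decode it with the renderer to obtain a complete asset; the sketch omits the sampling loop and any real-scene conditioning for brevity.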
