

Poster

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion

Hao Wen · Zehuan Huang · Yaohui Wang · Xinyuan Chen · Lu Sheng


Abstract:

Existing image-to-3D creation methods typically split the task into two separate stages, multi-view image generation and 3D reconstruction, which leads to two main limitations: (1) in the multi-view generation stage, the generated images struggle to preserve 3D consistency; (2) in the 3D reconstruction stage, there is a domain gap between the real training data and the generated multi-view images used as input during inference. To address these issues, we propose Ouroboros3D, an end-to-end trainable framework that integrates multi-view generation and 3D reconstruction into a recursive diffusion process through a feedback mechanism. Our framework operates through iterative cycles, each consisting of a denoising process and a reconstruction step. By incorporating a 3D-aware feedback mechanism, our multi-view generative model takes explicit 3D geometric information (e.g., texture and position) fed back from the reconstruction result of the previous cycle as conditions, thereby modeling consistency at the 3D geometric level. Furthermore, by jointly training the multi-view generative and reconstruction models, we alleviate the domain gap in the reconstruction stage and enable mutual enhancement within the recursive process. Experimental results demonstrate that Ouroboros3D outperforms methods that treat these stages separately, as well as those that combine them only at inference time, achieving superior multi-view consistency and producing 3D models with higher geometric realism.
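The abstract's recursive generation-reconstruction loop can be summarized schematically as follows. This is a minimal sketch under assumed interfaces; all names here (mv_diffusion, reconstructor, render_maps, num_cycles) are hypothetical placeholders and not the authors' actual implementation or API.

```python
# Schematic sketch of a recursive multi-view diffusion / 3D reconstruction loop
# with 3D-aware feedback, as described in the abstract. All object interfaces
# below are hypothetical placeholders, not the paper's actual code.

def ouroboros3d(cond_image, mv_diffusion, reconstructor, num_cycles=3):
    """Alternate multi-view denoising and feed-forward 3D reconstruction,
    feeding rendered 3D-aware maps (e.g. color, position) back as conditions."""
    feedback = None          # no 3D-aware condition is available on the first cycle
    views, scene = None, None

    for _ in range(num_cycles):
        # 1) Denoise multi-view images, conditioned on the input image and,
        #    from the second cycle on, the rendered feedback maps.
        views = mv_diffusion.sample(image=cond_image, feedback=feedback)

        # 2) Reconstruct an explicit 3D representation from the current views.
        scene = reconstructor(views)

        # 3) Render 3D-aware maps (e.g. color and position maps) from the
        #    reconstruction; these condition the next denoising cycle.
        feedback = scene.render_maps(["color", "position"])

    return scene, views
```

Joint training of both components, as the abstract notes, would expose the reconstructor to generated (rather than only real) multi-view inputs, which is what narrows the inference-time domain gap.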
