Poster
DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing
Yufei Huang · Bangyan Liao · Yuqi Hu · Haitao Lin · Lirong Wu · Siyuan Li · Cheng Tan · Zicheng Liu · Yunfan Liu · Zelin Zang · Chang Yu · Zhen Lei
Abstract:
Score Distillation Sampling (SDS) has been successfully extended to text-driven 3D scene editing with 2D pretrained diffusion models. However, SDS-based editing methods suffer from lengthy optimization processes, resulting in slow inference and low quality. We attribute the lengthy optimization to the stochastic optimization scheme used in SDS-based editing, where many steps may conflict with each other (e.g., the inherent trade-off between editing and preservation). To reduce this internal conflict and speed up the editing process, we propose to separate editing and preservation in time with a diffusion time schedule and frame the 3D editing optimization process as a diffusion bridge sampling process. Motivated by the analysis above, we introduce DaCapo, a fast, diffusion sampling-like 3D editing method that incorporates a novel stacked bridge framework, which estimates a direct diffusion bridge between the source and target distributions with only a pretrained 2D diffusion model. Specifically, it models the editing process as a combination of inversion and generation, where both processes happen simultaneously as a stack of diffusion bridges. DaCapo achieves a 15× speed-up with results comparable to the state-of-the-art SDS-based method. It completes the process in just 2,500 steps on a single GPU and accommodates a variety of 3D representation methods.
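
The contrast between the stochastic scheme and a diffusion time schedule can be illustrated with a minimal sketch. It is not the DaCapo implementation: the function names, the timestep range (20 to 980), and the linear, monotonically decreasing schedule are all assumptions chosen for illustration. The abstract states only that editing and preservation are separated in time by a schedule; only the step count of 2,500 is taken from it.

```python
import random

def sds_random_timesteps(num_steps, t_min=20, t_max=980, seed=0):
    """Stochastic scheme: each optimization step draws an independent
    timestep, so high-noise steps (large changes) and low-noise steps
    (detail preservation) are interleaved and can pull against each other."""
    rng = random.Random(seed)
    return [rng.randint(t_min, t_max) for _ in range(num_steps)]

def annealed_timesteps(num_steps, t_min=20, t_max=980):
    """Sampling-like scheme (assumed linear): timesteps decrease from
    t_max to t_min, so coarse edits come first and fine refinement last."""
    if num_steps == 1:
        return [t_max]
    span = t_max - t_min
    return [round(t_max - span * i / (num_steps - 1)) for i in range(num_steps)]

# 2,500 steps is the budget quoted in the abstract; the range is hypothetical.
random_ts = sds_random_timesteps(2500)
annealed_ts = annealed_timesteps(2500)
```

Under the annealed schedule, every step sees a noise level no higher than the one before it, which is one simple way to keep editing-dominated and preservation-dominated steps from alternating.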