

Control4D: Efficient 4D Portrait Editing with Text

Ruizhi Shao · Jingxiang Sun · Cheng Peng · Zerong Zheng · Boyao ZHOU · Hongwen Zhang · Yebin Liu

Arch 4A-E Poster #428
Wed 19 Jun 10:30 a.m. PDT — noon PDT


Recent years have witnessed considerable achievements in editing images with text instructions. When these editors are applied to dynamic scenes, the edited scene tends to be temporally inconsistent because the 2D editors operate frame by frame. To tackle this issue, we propose Control4D, a novel approach for high-fidelity and temporally consistent 4D portrait editing. Control4D is built upon an efficient 4D representation with a 2D diffusion-based editor. Instead of using direct supervision from the editor, our method learns a 4D generator from it and thereby avoids the inconsistent supervision signals. Specifically, we employ a discriminator to learn the generation distribution based on the edited images and then update the generator with the discrimination signals. For more stable training, multi-level information is extracted from the edited images and used to facilitate the learning of the generator. Experimental results show that Control4D surpasses previous approaches and achieves more photorealistic and consistent 4D editing results.
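
The discriminator-based supervision described above can be sketched as an adversarial objective. The abstract does not state the exact loss, so the block below assumes the standard non-saturating GAN form purely for illustration. All symbols are notation introduced here, not the authors': $G_\theta$ is the 4D generator rendered at camera view $v$ and time $t$, $E$ is the 2D diffusion-based editor, $c$ is the text instruction, $x_{v,t}$ is the source frame, and $D_\phi$ is the discriminator. The multi-level information used to stabilize training is not modeled in this sketch.

```latex
% Illustrative sketch only: an assumed non-saturating GAN objective,
% not the loss confirmed by the abstract.
\begin{align*}
% Edited frame produced by the 2D editor (may be inconsistent across v, t)
\tilde{x}_{v,t} &= E(x_{v,t},\, c) \\
% Discriminator: edited frames are "real", generator renderings are "fake"
\mathcal{L}_D(\phi) &= -\,\mathbb{E}_{v,t}\!\left[\log D_\phi(\tilde{x}_{v,t})\right]
  - \mathbb{E}_{v,t}\!\left[\log\!\left(1 - D_\phi\!\left(G_\theta(v,t)\right)\right)\right] \\
% Generator: updated only through the discriminator's signal
\mathcal{L}_G(\theta) &= -\,\mathbb{E}_{v,t}\!\left[\log D_\phi\!\left(G_\theta(v,t)\right)\right]
\end{align*}
```

Under this assumed form, the generator never regresses directly onto any single edited frame $\tilde{x}_{v,t}$. It only has to match the distribution of edited frames, which is one way to read the abstract's claim that the method avoids inconsistent supervision signals.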
