Feed-forward Gaussian Registration for Head Avatar Creation and Editing
Malte Prinzler ⋅ Paulo Gotardo ⋅ Siyu Tang ⋅ Timo Bolkart
Abstract
We present MATCH (Multi-view Avatars from Topologically Corresponding Heads), a multi-view Gaussian registration method for high-quality head avatar creation and editing. State-of-the-art multi-view head avatars require time-consuming head tracking followed by an expensive avatar optimization, often resulting in a total creation time of more than one day. MATCH instead directly predicts Gaussian splat textures in correspondence from calibrated multi-view images in 0.5 seconds per frame. The learned intra-subject correspondence across frames allows us to quickly build personalized head avatars, while correspondence across subjects enables applications such as expression transfer, optimization-free tracking, semantic editing, and identity interpolation. We learn to establish such correspondences end-to-end with a transformer-based model that predicts textures of Gaussian splats in the fixed UV layout of a template mesh. To this end, we introduce a novel registration-guided attention block, in which each UV map token attends exclusively to image tokens depicting its corresponding mesh region. MATCH outperforms existing methods for novel-view synthesis, geometry registration, and head avatar generation, the latter being $10\times$ faster than the qualitatively closest baseline. Code and model weights will be published upon acceptance.
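The idea of a UV map token attending only to image tokens that depict its mesh region can be illustrated with ordinary masked cross-attention. The following is a minimal sketch under stated assumptions, not the MATCH implementation: the function name `masked_cross_attention`, the boolean `allowed` mask, and the omission of learned query/key/value projections and multiple heads are all illustrative choices, and how the mask is actually derived from the registration is not specified in the abstract.

```python
import numpy as np


def masked_cross_attention(uv_tokens, img_tokens, allowed):
    """Cross-attention from UV tokens to image tokens under a boolean mask.

    uv_tokens:  (U, D) array, one feature vector per UV map token.
    img_tokens: (I, D) array, one feature vector per image token.
    allowed:    (U, I) boolean array (hypothetical); allowed[u, i] is True
                if image token i is assumed to depict the mesh region of
                UV token u.
    Returns the attended features (U, D) and the attention weights (U, I).
    """
    d = uv_tokens.shape[-1]
    # Scaled dot-product scores between every UV token and every image token.
    scores = uv_tokens @ img_tokens.T / np.sqrt(d)
    # Disallowed pairs get -inf so they receive zero weight after softmax.
    scores = np.where(allowed, scores, -np.inf)
    # UV tokens with no allowed image token would produce NaNs in softmax;
    # give them finite scores here and zero their weights below.
    has_any = allowed.any(axis=1, keepdims=True)
    scores = np.where(has_any, scores, 0.0)
    # Numerically stable softmax over the image-token axis.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores) * has_any
    denom = weights.sum(axis=1, keepdims=True)
    weights = np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)
    return weights @ img_tokens, weights
```

In this sketch, a UV token whose region is visible in no image token receives an all-zero output; a real model would need some other way to fill such regions, which is outside the scope of this illustration.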