Lafite: A Generative Latent Field for 3D Native Texturing
Abstract
Generating detailed and seamless textures for 3D meshes remains an open challenge. Recent image and video generation models, empowered by large-scale visual priors, can produce highly detailed images and are thus promising for multi-view texture synthesis. However, evaluating texture quality involves multiple dimensions beyond visual fidelity. Multi-view back-projection often introduces seams and inconsistencies between views or near occluded regions, while direct generation on UV-unwrapped maps suffers from UV distortions and ambiguities. Generating textures directly in 3D space offers an inherent advantage in ensuring continuity and spatial coherence, making it a critical and worthwhile research direction. We therefore systematically investigate 3D-native texture generation from the perspectives of representation and generation, and present current best practices for this approach. To this end, we employ a local vector field with a structured latent representation to model the joint distribution of texture and geometry. This design enables texture generation conditioned on high-fidelity geometric features within a unified latent space. Crucially, our approach is inherently free from occlusion artifacts, multi-view inconsistencies, and UV-related distortions caused by fragmented surface parameterizations. Extensive experiments demonstrate that our method produces high-quality, seamless textures and supports flexible downstream tasks such as editing and inpainting, marking a significant step forward in 3D-native texture generation.