Poster
Generalizable Object Keypoint Localization from Generative Priors
Dongkai Wang · Jiang Duan · Liangjian Wen · Shiyu Xuan · Hao CHEN · Shiliang Zhang
Generalizable object keypoint localization is a fundamental computer vision task for understanding object structure. It is challenging for existing keypoint localization methods because their limited training data cannot provide generalizable shape and semantic cues, which leads to inferior performance and poor generalization. Instead of relying on large-scale training data, this work tackles the challenge by exploiting the rich priors of large generative models. We propose a data-efficient, generalizable localization method named GenLoc. GenLoc extracts generative priors from a pre-trained image generation model by computing the correlation map between the image latent feature and the condition embedding. These priors are then optimized with our proposed heatmap expectation loss to perform object keypoint localization. Benefiting from the rich knowledge of object semantics and structure encoded in the generative priors, GenLoc achieves superior performance on various object keypoint localization benchmarks. The gains are larger in cross-domain, few-shot, and zero-shot evaluation settings, e.g., an AP improvement of more than 20% over CLAMP in various zero-shot settings.
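The two steps named in the abstract, a correlation map between latent features and a condition embedding, and a loss on the heatmap's expectation, can be illustrated with a minimal sketch. The abstract does not define either operation, so everything here is an assumption: the correlation is taken as a per-location dot product, the heatmap as a spatial softmax of that map, and the loss as the L1 distance between the expected (soft-argmax) coordinate and the ground-truth keypoint. All function names are hypothetical and are not taken from GenLoc.

```python
import math

def correlation_map(latent, embedding):
    """Assumed form: dot product of each spatial feature with the embedding.

    latent is an H x W x C nested list, embedding a length-C list.
    Returns an H x W map of scores.
    """
    return [[sum(f * e for f, e in zip(cell, embedding)) for cell in row]
            for row in latent]

def softmax2d(score_map):
    """Normalize an H x W score map into a probability heatmap."""
    peak = max(max(row) for row in score_map)  # subtract max for stability
    exps = [[math.exp(v - peak) for v in row] for row in score_map]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]

def expected_coordinate(heatmap):
    """Expected (x, y) position under the heatmap (soft-argmax)."""
    x = sum(p * j for row in heatmap for j, p in enumerate(row))
    y = sum(p * i for i, row in enumerate(heatmap) for p in row)
    return x, y

def heatmap_expectation_loss(score_map, target_xy):
    """Assumed form: L1 distance between expected and target coordinates."""
    x, y = expected_coordinate(softmax2d(score_map))
    return abs(x - target_xy[0]) + abs(y - target_xy[1])

# Toy example: a 3 x 3 latent with 2 channels and one strongly matching cell.
latent = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
latent[1][2] = [50.0, 0.0]          # row 1, column 2 matches the embedding
scores = correlation_map(latent, [1.0, 0.0])
loss = heatmap_expectation_loss(scores, (2.0, 1.0))  # target x=2, y=1
```

Because the expected coordinate is a differentiable function of the score map, a loss of this kind can supervise keypoint positions directly rather than matching a rendered Gaussian target; whether GenLoc makes the same choice is not stated in the abstract.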