

Poster

Bridging Viewpoint Gaps: Geometric Reasoning Boosts Semantic Correspondence

Qiyang Qian · Hansheng Chen · Masayoshi Tomizuka · Kurt Keutzer · Qianqian Wang · Chenfeng Xu


Abstract:

Finding semantic correspondences between images is a challenging problem in computer vision, particularly under significant viewpoint changes. Prior methods rely on semantic features from pre-trained 2D models such as Stable Diffusion and DINOv2, which often struggle to extract viewpoint-invariant features. To overcome this, we propose a novel approach that integrates geometric and semantic reasoning. Unlike prior methods that rely on heuristic geometric enhancements, our framework fine-tunes DUSt3R on synthetic cross-instance data to reconstruct distinct objects in an aligned 3D space. By learning to deform these objects into similar shapes under semantic supervision, we enable efficient KNN-based geometric matching, followed by sparse semantic matching within the local KNN candidates. Although trained only on synthetic data, our method generalizes effectively to real-world images, achieving up to 7.4-point improvements in zero-shot settings on the rigid-body subset of SPair-71k and up to 19.6-point gains under extreme viewpoint variations. It also accelerates inference by up to 40 times, demonstrating both its robustness to viewpoint changes and its efficiency for practical applications.
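The two-stage matching described in the abstract can be illustrated with a minimal sketch (not the authors' code): a geometric K-nearest-neighbor search in the shared, aligned 3D space first narrows each query to a small candidate set, and semantic feature similarity then selects the final match among those candidates. The function name, array shapes, and the assumption of L2-normalized features are illustrative choices, not details from the paper.

```python
import numpy as np

def match_keypoints(src_pts, src_feats, tgt_pts, tgt_feats, k=10):
    """For each source keypoint, return the index of its matched target point.

    src_pts:   (M, 3) source keypoints in the aligned 3D space
    src_feats: (M, D) L2-normalized semantic features of the source keypoints
    tgt_pts:   (N, 3) target points in the same aligned 3D space
    tgt_feats: (N, D) L2-normalized semantic features of the target points
    """
    matches = []
    for p, f in zip(src_pts, src_feats):
        # Stage 1: geometric KNN — cheap candidate filtering in 3D.
        d2 = np.sum((tgt_pts - p) ** 2, axis=1)
        cand = np.argpartition(d2, min(k, len(d2) - 1))[:k]
        # Stage 2: semantic matching restricted to the K candidates
        # (cosine similarity, since features are L2-normalized).
        sims = tgt_feats[cand] @ f
        matches.append(cand[np.argmax(sims)])
    return np.array(matches)
```

Restricting the (relatively expensive) semantic comparison to K geometric candidates, rather than all N target points, is what would account for the kind of runtime reduction the abstract reports.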
