Poster

Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera

Yuliang Guo · Sparsh Garg · S. Mahdi H. Miangoleh · Xinyu Huang · Liu Ren

ExHall D Poster #81
Sun 15 Jun 2 p.m. PDT — 4 p.m. PDT

Abstract: Accurate metric depth estimation from monocular cameras is essential for applications such as autonomous driving, AR/VR, and robotics. While recent depth estimation methods demonstrate strong zero-shot generalization, achieving accurate metric depth across diverse camera types, particularly those with large fields of view (FoV) like fisheye and $360^\circ$ cameras, remains challenging. This paper introduces Depth Any Camera (DAC), a novel zero-shot metric depth estimation framework that extends a perspective-trained model to handle varying FoVs effectively. Notably, DAC is trained exclusively on perspective images, yet it generalizes seamlessly to fisheye and $360^\circ$ cameras without requiring specialized training. DAC leverages Equi-Rectangular Projection (ERP) as a unified image representation, enabling consistent processing of images with diverse FoVs. Key components include an efficient Image-to-ERP patch conversion for online ERP-space augmentation, a FoV alignment operation to support effective training across a broad range of FoVs, and multi-resolution data augmentation to address resolution discrepancies between training and testing. DAC achieves state-of-the-art zero-shot metric depth estimation, improving $\delta_1$ accuracy by up to 50% on multiple indoor fisheye and $360^\circ$ datasets, demonstrating robust generalization across camera types while relying only on perspective training data.
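
To make two terms in the abstract concrete, the sketch below shows how a pixel from a perspective (pinhole) image can be placed on an ERP grid, and how the $\delta_1$ accuracy metric is commonly computed. It is illustrative only and is not the authors' implementation. The function names, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the axis convention (x right, y down, z forward), and the full $360^\circ \times 180^\circ$ ERP layout are assumptions made for this example. The 1.25 threshold is the usual definition of $\delta_1$ in the depth estimation literature.

```python
import math


def pixel_to_erp(u, v, fx, fy, cx, cy, erp_width, erp_height):
    """Map a pinhole pixel (u, v) to coordinates on a full 360x180 ERP grid.

    Assumed convention (hypothetical, for illustration): camera x points
    right, y points down, z points forward; the optical axis lands at the
    center of the ERP image.
    """
    # Back-project the pixel to a viewing ray at unit depth.
    x = (u - cx) / fx
    y = (v - cy) / fy
    z = 1.0

    # Convert the ray direction to longitude and latitude (radians).
    lon = math.atan2(x, z)                  # in (-pi/2, pi/2) for z > 0
    lat = math.atan2(y, math.hypot(x, z))   # in (-pi/2, pi/2)

    # Scale angles to ERP pixel coordinates.
    erp_u = (lon / (2.0 * math.pi) + 0.5) * erp_width
    erp_v = (lat / math.pi + 0.5) * erp_height
    return erp_u, erp_v


def delta1(pred, gt, threshold=1.25):
    """Fraction of valid pixels where max(pred/gt, gt/pred) < threshold."""
    hits = 0
    valid = 0
    for p, g in zip(pred, gt):
        if p <= 0 or g <= 0:
            continue  # skip pixels without valid depth
        valid += 1
        if max(p / g, g / p) < threshold:
            hits += 1
    return hits / valid if valid else 0.0


# Usage: the principal point maps to the center of the ERP image.
center = pixel_to_erp(320, 240, 500.0, 500.0, 320.0, 240.0, 2048, 1024)
score = delta1([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
```

Because longitude and latitude depend only on the ray direction, the same ERP grid can hold content from perspective, fisheye, or $360^\circ$ cameras once each camera's own back-projection replaces the pinhole step above.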
