

Poster

Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera

Yuliang Guo · Sparsh Garg · S. Mahdi H. Miangoleh · Xinyu Huang · Liu Ren


Abstract: Accurate metric depth estimation from monocular cameras is essential for applications such as autonomous driving, AR/VR, and robotics. While recent depth estimation methods demonstrate strong zero-shot generalization, achieving accurate metric depth across diverse camera types, particularly those with large fields of view (FoV) such as fisheye and 360 cameras, remains challenging. This paper introduces Depth Any Camera (DAC), a novel zero-shot metric depth estimation framework that extends a perspective-trained model to handle varying FoVs effectively. Notably, DAC is trained exclusively on perspective images, yet it generalizes seamlessly to fisheye and 360 cameras without requiring specialized training. DAC leverages Equi-Rectangular Projection (ERP) as a unified image representation, enabling consistent processing of images with diverse FoVs. Key components include an efficient Image-to-ERP patch conversion for online ERP-space augmentation, a FoV alignment operation to support effective training across a broad range of FoVs, and multi-resolution data augmentation to address resolution discrepancies between training and testing. DAC achieves state-of-the-art zero-shot metric depth estimation, improving δ₁ accuracy by up to 50% on multiple indoor fisheye and 360 datasets, demonstrating robust generalization across camera types while relying only on perspective training data.
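To make the Image-to-ERP patch conversion mentioned above concrete, the sketch below resamples a pinhole (perspective) image onto an equirectangular (ERP) patch by casting a latitude/longitude ray for each ERP pixel and projecting it through the camera intrinsics. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name perspective_to_erp_patch, the intrinsics parameters (fx, fy, cx, cy), and the nearest-neighbor sampling are illustrative choices only.

import numpy as np

def perspective_to_erp_patch(img, fx, fy, cx, cy, patch_h, patch_w,
                             fov_lat, fov_lon):
    # Illustrative sketch (not the paper's API): resample a pinhole image
    # onto an ERP patch centered on the optical axis.
    # Latitude/longitude grid covering the ERP patch (radians).
    lat = np.linspace(-fov_lat / 2, fov_lat / 2, patch_h)
    lon = np.linspace(-fov_lon / 2, fov_lon / 2, patch_w)
    lon, lat = np.meshgrid(lon, lat)

    # Unit ray direction for each ERP pixel (camera looks along +z).
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)

    # Pinhole projection of each ray into the source image.
    u = fx * x / z + cx
    v = fy * y / z + cy

    # Nearest-neighbor sampling with a validity mask for rays that miss the image.
    h, w = img.shape[:2]
    ui = np.round(u).astype(int)
    vi = np.round(v).astype(int)
    valid = (z > 0) & (ui >= 0) & (ui < w) & (vi >= 0) & (vi < h)

    patch = np.zeros((patch_h, patch_w) + img.shape[2:], dtype=img.dtype)
    patch[valid] = img[vi[valid], ui[valid]]
    return patch, valid

# Example usage: map a hypothetical 60-degree-FoV perspective image onto an ERP patch.
img = np.zeros((480, 640, 3), dtype=np.uint8)
patch, mask = perspective_to_erp_patch(img, fx=550.0, fy=550.0, cx=320.0, cy=240.0,
                                       patch_h=256, patch_w=256,
                                       fov_lat=np.deg2rad(60), fov_lon=np.deg2rad(60))

A practical pipeline would replace the nearest-neighbor lookup with bilinear interpolation and combine this mapping with the paper's FoV alignment and multi-resolution augmentation; the sketch only shows the geometric core of placing perspective pixels into ERP space.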
