

IBD-SLAM: Learning Image-Based Depth Fusion for Generalizable SLAM

Minghao Yin · Shangzhe Wu · Kai Han

Arch 4A-E Poster #88
Thu 20 Jun 10:30 a.m. PDT — noon PDT


We present a method for visual SLAM that generalizes to unseen scenes without retraining and optimizes quickly. Existing methods struggle to generalize to novel scenes, i.e., they are optimized on a per-scene basis. Recently, neural scene representations have shown promise in SLAM for producing high-quality dense 3D reconstruction, at the cost of long training time. To overcome these limitations on generalization and efficiency, we propose IBD-SLAM, an Image-Based Depth fusion framework for generalizable SLAM. In particular, we adopt a Neural Radiance Field (NeRF) for scene representation. Inspired by image-based rendering, instead of learning a fixed grid of scene representation, we propose to learn image-based depth fusion by deriving xyz-maps from the depth maps inferred from the given images. Once trained, the model can be applied to new uncalibrated monocular RGBD videos of unseen scenes without retraining. For any given new scene, only the pose parameters need to be optimized, which is very efficient. We thoroughly evaluate IBD-SLAM on public visual SLAM benchmarks, outperforming the previous state-of-the-art while being 10 times faster.
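The abstract's core operation of deriving an xyz-map from a depth map is standard pinhole back-projection: each pixel's depth is lifted to a 3D point in the camera frame using the intrinsics. The sketch below is an illustration of that step only, not the paper's implementation; the function name and the assumption of a simple pinhole model with known intrinsics are ours.

```python
import numpy as np

def depth_to_xyz(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) into a per-pixel xyz-map (H, W, 3)
    in the camera frame, assuming a pinhole model with focal lengths
    (fx, fy) and principal point (cx, cy)."""
    h, w = depth.shape
    # Pixel coordinate grids: u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Lift each pixel to 3D: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)
```

Given camera poses, such xyz-maps from multiple views can be transformed into a common world frame and fused, which is the sense in which only pose parameters need optimizing for a new scene.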
