SEA-Flow3D: Simplified, Efficient, and Accurate Scene Flow via Spatial Vector Sampling and Multi-scale Refinement
Abstract
Although depth-assisted scene flow estimation has advanced rapidly, mainstream dense frameworks (e.g., RAFT-3D) still rely primarily on 2D feature correlations to optimize 3D motion fields. This hinders their ability to exploit 3D structural priors and consequently limits robustness in complex scenes. We present SEA-Flow3D, a simple, efficient, and accurate framework for dense scene flow estimation. At its core lies a Spatial Vector Sampling (SVS) module that jointly samples 3D coordinates and correlation volumes within the local neighborhood of matched points, producing a direction-aware correlation representation with explicit spatial vectors that provides strong geometric guidance for subsequent optimization. Following the simplicity-and-efficiency principle, SEA-Flow3D adopts a RAFT-style multi-scale recurrent refinement architecture, integrating an RNN-based optimizer with context-guided upsampling to achieve higher accuracy with fewer iterations. Extensive experiments on KITTI and Sintel demonstrate that SEA-Flow3D achieves state-of-the-art performance while remaining efficient and lightweight.
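To make the idea of jointly sampling correlation values and 3D coordinates concrete, the following is a minimal NumPy sketch, not the SEA-Flow3D implementation. The abstract does not specify the sampling details, so everything here is an assumption for illustration: the function name `spatial_vector_sample`, the all-pairs correlation layout `(H, W, H, W)`, nearest-neighbor lookup with border clamping, and the definition of a "spatial vector" as the 3D offset from the matched point to each of its neighbors.

```python
import numpy as np

def spatial_vector_sample(corr, xyz2, match, radius=1):
    """Illustrative sketch (hypothetical, not the paper's code).

    corr:  (H, W, H, W) all-pairs correlation between frame 1 and frame 2.
    xyz2:  (H, W, 3) back-projected 3D points of frame 2.
    match: (H, W, 2) integer (x, y) pixel in frame 2 currently matched
           to each frame-1 pixel.
    Returns (H, W, K, 4) with K = (2 * radius + 1) ** 2; the last axis is
    [correlation, dx3d, dy3d, dz3d] for each neighbor of the matched point.
    """
    H, W = match.shape[:2]
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

    # Clamp the matched centers to the image (assumed border handling).
    cx = np.clip(match[..., 0], 0, W - 1)
    cy = np.clip(match[..., 1], 0, H - 1)
    center_xyz = xyz2[cy, cx]  # (H, W, 3) 3D position of each matched point

    feats = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ny = np.clip(cy + dy, 0, H - 1)
            nx = np.clip(cx + dx, 0, W - 1)
            c = corr[ys, xs, ny, nx]         # (H, W) correlation at neighbor
            vec = xyz2[ny, nx] - center_xyz  # (H, W, 3) spatial vector
            feats.append(np.concatenate([c[..., None], vec], axis=-1))
    return np.stack(feats, axis=2)

# Usage with random data and an identity match (each pixel matched to itself).
rng = np.random.default_rng(0)
H, W = 4, 5
corr = rng.standard_normal((H, W, H, W))
xyz2 = rng.standard_normal((H, W, 3))
ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
match = np.stack([xs, ys], axis=-1)
out = spatial_vector_sample(corr, xyz2, match, radius=1)
print(out.shape)  # (4, 5, 9, 4)
```

Pairing each sampled correlation value with an explicit 3D offset is what makes the representation direction-aware in this sketch: a downstream optimizer sees not only how well each neighbor matches but also where that neighbor lies in space relative to the current match.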