TopoMA: Topology-Guided Multi-Agent Dense RGB 3D Reconstruction via Distributed Inference
Abstract
Multi-agent 3D reconstruction is a key technology for large-scale VR/AR, robot swarms, and digital twins, and has attracted growing attention. Recent end-to-end 3D reconstruction methods perform well in single-agent scenarios but do not extend directly to multi-agent collaborative settings, where they often suffer from unstable tracking, excessive memory consumption, and frequent loop-closure failures, and therefore fail to meet the requirements of real-time, large-scale deployment. To address these issues, we propose TopoMA, a real-time end-to-end 3D reconstruction framework tailored for multi-agent collaboration. TopoMA explicitly models the spatial topological structure of the scene and tightly couples it with end-to-end representation learning, jointly addressing core challenges such as inter-agent spatial alignment and submap fusion. Concretely, we introduce topology skeleton modeling and optimization, decentralized loop closure, and topology-guided residual transport, and build on them a fully distributed inference architecture in which each agent independently stores, reconstructs, and incrementally optimizes its own map while collaborating through lightweight topological information. Extensive experiments demonstrate that, compared with existing methods, TopoMA achieves consistently higher trajectory accuracy, reconstruction quality, robustness, and topological consistency, along with better adaptability and scalability.