JUMP-Hand: Learning Joint-wise Uncertainty to Gate Mixture of View Experts for Multi-View 3D Hand Reconstruction
Abstract
In this paper, we propose JUMP-Hand, a novel method for multi-view 3D hand reconstruction and the first to introduce probabilistic joint-wise uncertainty as an explicit gating mechanism for fusing multi-view information. Existing approaches usually fuse multi-view information by naïve pooling or implicit attention. However, they overlook that each hand joint exhibits varying visibility and reliability across views, and indiscriminately aggregating noisy or unreliable information may degrade performance. For instance, one joint may be clearly visible in one view, while another joint is occluded in that view but visible in a different view. In contrast, JUMP-Hand addresses this by introducing the core insight of Mixture of Experts (MoE), regarding each 2D view as an expert. The key idea is to quantify the reliability of each view expert through joint-wise uncertainty modeling, which serves as an explicit gating signal that routes the experts' partial yet complementary cues for each joint in a coarse-to-fine reconstruction paradigm. In this design, uncertainty not only guides uncertainty-aware triangulation for reliable 3D hand initialization in the coarse stage, but also acts as a gating signal in the refinement stage to adaptively aggregate multi-scale features from different view experts on a joint-wise basis, enabling robust 3D hand reconstruction. Extensive experiments on DexYCB-MV, HO3D-MV, and OakInk-MV demonstrate that our method achieves state-of-the-art results, validating the effectiveness of joint-wise uncertainty gating for reliable 3D hand reconstruction. The code will be released upon acceptance.
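The joint-wise gating described above can be illustrated with a minimal sketch. This is not the paper's implementation; it only assumes that each view expert predicts, for every joint, a feature vector and a scalar uncertainty, and that lower uncertainty should translate into a higher fusion weight (here via a softmax over views). All names and shapes are illustrative.

```python
import numpy as np

def uncertainty_gated_fusion(view_features, view_uncertainty):
    """Fuse per-joint features from V view experts, gated by joint-wise uncertainty.

    view_features:    (V, J, C) array - per-view, per-joint feature vectors.
    view_uncertainty: (V, J) array - predicted uncertainty (higher = less reliable).
    Returns a (J, C) array of fused per-joint features.
    """
    # Turn uncertainty into gating logits: lower uncertainty -> larger logit.
    logits = -view_uncertainty                          # (V, J)
    logits = logits - logits.max(axis=0, keepdims=True) # numerical stability
    w = np.exp(logits)
    w = w / w.sum(axis=0, keepdims=True)                # softmax over the view axis

    # Joint-wise weighted sum: each joint gets its own mixture of view experts.
    return (w[..., None] * view_features).sum(axis=0)   # (J, C)
```

Under this sketch, a joint that is occluded in one view (high uncertainty there) is reconstructed mostly from the views where it is clearly visible, matching the routing behavior the abstract describes.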