Poster
Hand-held Object Reconstruction from RGB Video with Dynamic Interaction
Shijian Jiang · Qi Ye · Rengan Xie · Yuchi Huo · Jiming Chen
This work aims to reconstruct the 3D geometry of a rigid object manipulated by one or both hands from monocular RGB video. Previous methods rely on Structure-from-Motion or hand priors to estimate the relative motion between the object and the camera, and typically assume textured objects or single-hand interactions. To accurately recover object geometry under dynamic hand-object interactions, we incorporate priors from 3D generation models into object pose estimation and propose semantic consistency constraints to address the shape and texture discrepancies between the generated priors and the observations. Object poses are first initialized and then jointly optimized together with an implicit neural representation of the object. During this optimization, a novel pose outlier voting strategy based on inter-view consistency corrects large pose errors. Experiments on three datasets demonstrate that our method significantly outperforms the state of the art in reconstruction quality for both single- and two-hand scenarios.