Poster
Pose-Guided Temporal Enhancement for Robust Low-Resolution Hand Reconstruction
Kaixin Fan · Pengfei Ren · Jingyu Wang · Haifeng Sun · Qi Qi · Zirui Zhuang · Jianxin Liao
3D hand reconstruction is essential for non-contact human-computer interaction, but existing methods struggle with low-resolution images, which arise when the user interacts from a slightly greater distance. A single low-resolution image lacks detailed appearance information; leveraging temporal information can compensate for this and improve the robustness and accuracy of hand reconstruction. Existing temporal methods typically represent temporal information with joint features, which avoids interference from redundant background information. However, joint features discard too much of the spatial context carried by visual features, which limits reconstruction accuracy. We propose to integrate temporal joint features with visual features to construct a robust low-resolution visual representation. Because joint features and visual features are misaligned in both representation form and semantics, we introduce Triplane Features, a dense representation with 3D spatial awareness, to bridge the gap between them. Triplane Features are obtained by orthogonally projecting the joint features, which embeds hand structure information into the 3D spatial context. Furthermore, we compress the spatial information of the three planes into a 2D dense feature through Spatial-Aware Fusion and use it to enhance the visual features. By reconstructing the hand from enhanced visual features enriched with temporal information, our method, operating at much lower resolutions, achieves performance competitive with state-of-the-art methods operating at high resolution on DexYCB, HanCo and H2O.
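To make the triplane idea concrete, the following is a minimal sketch of orthogonally projecting per-joint features onto three axis-aligned planes and then collapsing them into one 2D map. It is an illustration under stated assumptions, not the authors' implementation: the function names (`to_cell`, `project_triplane`, `fuse_to_2d`), the normalization of joint coordinates to [-1, 1], the sum-based scattering, and the grid resolution are all hypothetical. In particular, the fusion step here is a simple depth-averaging stand-in, whereas the paper's Spatial-Aware Fusion is a separate module whose details the abstract does not give.

```python
def to_cell(coord, res):
    """Map a coordinate in [-1, 1] to a grid index in [0, res - 1]."""
    idx = int((coord + 1.0) / 2.0 * res)
    return min(max(idx, 0), res - 1)


def empty_plane(res, dim):
    """Create a res x res grid whose cells are zero vectors of length dim."""
    return [[[0.0] * dim for _ in range(res)] for _ in range(res)]


def project_triplane(joints, feats, res):
    """Scatter joint features onto the xy, xz and yz planes.

    Each plane is an orthogonal projection: one coordinate is dropped and
    the joint's feature vector is added to the cell it falls into.
    Assumption: joints are (x, y, z) tuples normalized to [-1, 1].
    """
    dim = len(feats[0])
    planes = {name: empty_plane(res, dim) for name in ("xy", "xz", "yz")}
    for (x, y, z), feat in zip(joints, feats):
        ix, iy, iz = to_cell(x, res), to_cell(y, res), to_cell(z, res)
        # (row, column) index of this joint in each plane
        targets = (("xy", iy, ix), ("xz", iz, ix), ("yz", iz, iy))
        for name, row, col in targets:
            cell = planes[name][row][col]
            for k in range(dim):
                cell[k] += feat[k]
    return planes


def fuse_to_2d(planes, res):
    """Collapse the three planes into one 2D map aligned with the xy plane.

    Stand-in for a learned fusion: the xz and yz planes are averaged over
    depth and added to the xy cell that shares their x or y index.
    """
    dim = len(planes["xy"][0][0])
    fused = empty_plane(res, dim)
    for iy in range(res):
        for ix in range(res):
            for k in range(dim):
                depth_x = sum(planes["xz"][iz][ix][k] for iz in range(res)) / res
                depth_y = sum(planes["yz"][iz][iy][k] for iz in range(res)) / res
                fused[iy][ix][k] = planes["xy"][iy][ix][k] + depth_x + depth_y
    return fused


# Hypothetical toy input: two joints with 2-dimensional features.
joints = [(-0.9, -0.9, -0.9), (0.9, 0.9, 0.9)]
feats = [[1.0, 0.0], [0.0, 2.0]]
planes = project_triplane(joints, feats, res=2)
fused = fuse_to_2d(planes, res=2)
```

The fused map has the same spatial layout as the xy plane, which is what would let it be combined with an image-aligned visual feature map; how that combination is done in the actual method is not specified in the abstract.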