

Poster

Motions as Queries: One-Stage Multi-Person Holistic Human Motion Capture

Kenkun Liu · Yurong Fu · Weihao Yuan · Jing Lin · Peihao Li · Xiaodong Gu · Lingteng Qiu · Haoqian Wang · Zilong Dong · Xiaoguang Han


Abstract:

Existing methods for capturing multi-person holistic human motion from a monocular video usually integrate a detector, a tracker, and a human pose and shape estimator into a cascaded system. In contrast, we develop a one-stage multi-person holistic human motion capture system, which 1) employs only one network, enabling significant benefits from end-to-end training on a large-scale dataset; 2) allows the tracking module to improve during training, rather than being limited by a pre-trained tracker; 3) captures the motions of all individuals in a single shot, rather than tracking and estimating each person sequentially. In this system, each query within a temporal cross-attention module is responsible for the long-term motion of a specific individual, implicitly aggregating individual-specific information throughout the entire video. To further benefit from end-to-end training, we also construct a synthetic human video dataset with multi-person and whole-body annotations. Extensive experiments across different datasets demonstrate the efficacy and efficiency of both the proposed method and the dataset. The code of our method will be made publicly available.
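As a rough illustration of the "one query per individual" idea, the sketch below lets a small set of person queries attend over tokens gathered from every frame of a clip, so each query aggregates information across the whole video. It is not the authors' implementation: the function names, the shapes, the random inputs, and the omission of learned projections and multi-head structure are all assumptions made to keep the example short.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_cross_attention(queries, frame_tokens):
    """Let each person query attend over the tokens of all frames at once.

    queries:      list of N vectors of length D (hypothetical: one per person)
    frame_tokens: list of T frames, each a list of vectors of length D
    Returns the N output vectors and the N attention rows.
    """
    # Flatten the time axis so every query sees the entire clip.
    tokens = [tok for frame in frame_tokens for tok in frame]
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)

    outputs, weights = [], []
    for q in queries:
        # Scaled dot-product score between this query and every token.
        scores = [scale * sum(a * b for a, b in zip(q, tok)) for tok in tokens]
        attn = softmax(scores)
        # Attention-weighted sum of tokens (tokens double as values here).
        out = [sum(w * tok[d] for w, tok in zip(attn, tokens)) for d in range(dim)]
        outputs.append(out)
        weights.append(attn)
    return outputs, weights


# Toy sizes, chosen only for illustration.
random.seed(0)
num_people, num_frames, tokens_per_frame, dim = 3, 4, 5, 8
queries = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_people)]
video = [
    [[random.gauss(0, 1) for _ in range(dim)] for _ in range(tokens_per_frame)]
    for _ in range(num_frames)
]
motion_features, attention = temporal_cross_attention(queries, video)
```

Here `motion_features` holds one aggregated vector per query, and each row of `attention` spans all `num_frames * tokens_per_frame` tokens, which is the sense in which a single query covers the full clip. A real system would presumably add learned query, key, and value projections and decode per-frame pose and shape from each query.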
