

Exploring Pose-Aware Human-Object Interaction via Hybrid Learning

EASTMAN Z Y WU · Yali Li · Yuan Wang · Shengjin Wang

Arch 4A-E Poster #311
Thu 20 Jun 5 p.m. PDT — 6:30 p.m. PDT

Abstract: Human-Object Interaction (HOI) detection plays a crucial role in visual scene comprehension. Two-stage detectors have recently become prominent, but they face two primary challenges. First, the misalignment between feature representation and relation reasoning leaves the model short of the discriminative features that interaction detection requires. Second, because annotation is sparse, the second-stage interaction head generates numerous candidate <human, object> pairs, of which only a small fraction receive supervision. To address these issues, we propose a hybrid learning method based on pose-aware HOI feature refinement. Specifically, we devise a pose-aware feature refinement that encodes spatial features by considering human body pose characteristics. It directs attention towards key regions and thereby supplies the fine-grained features needed for HOI detection. Further, we introduce a hybrid learning method that combines supervision from HOI triplets with supervision from probabilistic soft labels, which are regenerated from decoupled verb-object pairs. This method explores the implicit connections between interactions, enhancing model generalization without requiring additional data. Our method establishes state-of-the-art performance on the HICO-DET benchmark and excels notably in detecting rare HOIs.
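
The hybrid supervision described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how soft labels are regenerated or how the two losses are weighted. The sketch assumes that a soft label for an HOI class is the normalized product of its decoupled verb and object scores, and that the soft-label term is added to the triplet loss with a hypothetical weight `alpha`. All function and parameter names are illustrative.

```python
import math


def soft_labels(verb_probs, object_probs, valid_pairs):
    """Build a soft label per HOI class from decoupled verb and object scores.

    Assumption: the score of class (verb, object) is the product of the two
    scores, normalized over the valid classes so the labels sum to 1.
    """
    scores = [verb_probs[v] * object_probs[o] for v, o in valid_pairs]
    total = sum(scores)
    if total == 0:
        return [0.0] * len(valid_pairs)
    return [s / total for s in scores]


def binary_cross_entropy(preds, targets, eps=1e-7):
    """Mean binary cross-entropy; targets may be hard (0/1) or soft (in [0, 1])."""
    loss = 0.0
    for p, t in zip(preds, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss / len(preds)


def hybrid_loss(preds, hard_targets, soft_targets, alpha=0.5):
    """Combine triplet (hard) supervision with soft-label supervision.

    hard_targets is None for candidate pairs without an annotation, so those
    pairs are supervised by the soft labels alone. alpha is a hypothetical
    weighting factor, not a value taken from the paper.
    """
    soft_term = alpha * binary_cross_entropy(preds, soft_targets)
    if hard_targets is None:
        return soft_term
    return binary_cross_entropy(preds, hard_targets) + soft_term


# Toy example with three HOI classes.
pairs = [("ride", "bicycle"), ("hold", "cup"), ("hold", "bicycle")]
labels = soft_labels({"ride": 0.6, "hold": 0.4},
                     {"bicycle": 0.5, "cup": 0.5}, pairs)
preds = [0.7, 0.2, 0.3]
annotated = hybrid_loss(preds, [1, 0, 0], labels)
unannotated = hybrid_loss(preds, None, labels)
```

In this toy setup an unannotated candidate pair still contributes a training signal through the soft labels, which is the property the abstract attributes to hybrid learning under sparse annotation.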
