

HOIAnimator: Generating Text-prompt Human-object Animations using Novel Perceptive Diffusion Models

Wenfeng Song · Xinyu Zhang · Shuai Li · Yang Gao · Aimin Hao · Xia HOU · Chenglizhao Chen · Ning Li · Hong Qin

Arch 4A-E Poster #61
Wed 19 Jun 10:30 a.m. PDT — noon PDT


Rapidly and effectively producing human-object interaction (HOI) animations directly from textual descriptions remains a challenge at the forefront of computer vision research. The task demands both a discriminating interpretation of language and a comprehensive physics-centric model that supports real-world dynamics. To address this, this paper proposes HOIAnimator, a novel, interactive diffusion model with perception ability, designed to animate complex interactions from linguistic narratives. The effectiveness of our model is anchored in two innovations: (1) Our Perceptive Diffusion Models (PDM) bring together two types of models: one focused on human movements and the other on objects. This combination allows for animations in which humans and objects move in concert with each other, making the overall motion more realistic. Additionally, we propose a Perceptive Message Passing (PMP) mechanism to enhance communication between the two models, ensuring that the animations are smooth and unified; (2) We devise an Interaction Contact Field (ICF), a model that implicitly captures the essence of HOIs. Beyond merely predicting contact points, the ICF assesses the proximity of the human and the object to their respective environment, informed by a probabilistic distribution of interactions learned throughout the denoising phase. Our comprehensive evaluation demonstrates HOIAnimator's superior ability to produce dynamic, context-aware animations that surpass existing benchmarks in text-driven animation synthesis.
