

Poster

MotionPro: A Precise Motion Controller for Image-to-Video Generation

Zhongwei Zhang · Fuchen Long · Zhaofan Qiu · Yingwei Pan · Wu Liu · Ting Yao · Tao Mei


Abstract:

Animating images with interactive motion control has become popular in image-to-video (I2V) generation. Modern approaches typically treat Gaussian-filtered trajectories as the sole motion control signal. However, approximating flow with a Gaussian kernel limits the controllability of fine-grained movement and commonly fails to disentangle object motion from camera motion. To address these issues, we present MotionPro, a region-wise motion controller that leverages region-wise trajectories and a motion mask to regulate fine-grained motion synthesis and to identify the exact target motion category (i.e., object or camera motion), respectively. Technically, MotionPro first estimates flow maps on each training video via a tracking model, and then samples region-wise trajectories from multiple local regions to simulate the inference scenario. Instead of approximating flow distributions coarsely with a large Gaussian kernel, our region-wise trajectory provides more precise control by directly employing the trajectories in local regions, and thus manages to characterize fine-grained movement. A motion mask is simultaneously derived from the predicted flow maps to represent holistic motion dynamics. To pursue natural motion control, MotionPro further strengthens video denoising by injecting the region-wise trajectory and motion mask as additional conditions via feature modulation. In addition, we construct a benchmark, MC-Bench, with 1.1K user-annotated image-trajectory pairs, for evaluating both fine-grained and object-level I2V motion control. Extensive experiments on WebVid-10M and MC-Bench demonstrate the effectiveness of MotionPro.
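
The two control signals described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how regions are chosen or how the mask is thresholded, so the grid tiling, `region_size`, `num_regions`, `threshold`, and both function names are assumptions made purely for illustration. The flow map is represented as a plain nested list of `(dx, dy)` displacements.

```python
import math
import random


def motion_mask(flow, threshold=1.0):
    """Binary mask marking pixels whose flow magnitude exceeds `threshold`.

    The thresholding rule is an assumption; the abstract only states that
    the mask is derived from the predicted flow maps.
    """
    return [[1 if math.hypot(dx, dy) > threshold else 0 for dx, dy in row]
            for row in flow]


def sample_region_trajectories(flow, region_size=2, num_regions=2, seed=0):
    """Keep the raw flow inside a few randomly chosen local regions.

    Flow outside the chosen regions is zeroed, so the control signal is
    sparse but unsmoothed (no Gaussian filtering). The grid tiling and the
    uniform random choice of tiles are hypothetical.
    """
    height, width = len(flow), len(flow[0])
    rng = random.Random(seed)
    # Candidate regions are the non-overlapping tiles of a regular grid.
    tiles = [(top, left)
             for top in range(0, height, region_size)
             for left in range(0, width, region_size)]
    chosen = rng.sample(tiles, min(num_regions, len(tiles)))

    sparse = [[(0.0, 0.0)] * width for _ in range(height)]
    selected = [[0] * width for _ in range(height)]
    for top, left in chosen:
        for y in range(top, min(top + region_size, height)):
            for x in range(left, min(left + region_size, width)):
                sparse[y][x] = flow[y][x]  # copy the exact local displacement
                selected[y][x] = 1
    return sparse, selected


# Toy 4x4 flow map: the right half moves by (3, 4), the left half is static.
toy_flow = [[(3.0, 4.0) if x >= 2 else (0.0, 0.0) for x in range(4)]
            for _ in range(4)]
toy_mask = motion_mask(toy_flow)
toy_sparse, toy_selected = sample_region_trajectories(toy_flow)
```

In this toy setup the mask marks the moving right half of the frame, while the sparse trajectory map carries exact displacements only inside the sampled tiles. A Gaussian-filtered signal would instead blur each displacement into its neighbourhood, which is the approximation the abstract argues against.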
