

Poster

Consistent and Controllable Image Animation with Motion Diffusion Models

Xin Ma · Yaohui Wang · Gengyun Jia · Xinyuan Chen · Tien-Tsin Wong · Yuan-Fang Li · Cunjian Chen


Abstract:

Diffusion models have made great progress in image animation thanks to their powerful generative capabilities. However, maintaining spatio-temporal consistency with the detailed content of the input static image over time (e.g., its style, background, and objects) and ensuring motion smoothness in animated videos guided by textual prompts remain challenging. In this paper, we introduce Cinemo, a novel image animation approach that achieves better image consistency and motion smoothness. Specifically, we propose two effective strategies, applied at the training and inference stages of Cinemo, to accomplish this goal. At the training stage, Cinemo learns the distribution of motion residuals, rather than directly predicting subsequent frames, via a motion diffusion model. At the inference stage, a noise refinement technique based on the discrete cosine transform is introduced to mitigate sudden motion changes. These strategies enable Cinemo to produce highly consistent, smooth, and motion-controllable results. Compared to previous methods, Cinemo offers simpler and more precise user controllability. Extensive experiments against several state-of-the-art methods, including both commercial tools and research approaches, across multiple metrics, demonstrate the effectiveness and superiority of our proposed approach.
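To make the two strategies concrete, below is a minimal sketch of what they might look like in a PyTorch/SciPy setting. This is not the authors' released implementation: the function names, the `low_freq_cutoff` parameter, and the tensor shapes are illustrative assumptions based only on the abstract's description of motion-residual learning and DCT-based noise refinement.

```python
# Illustrative sketch of the two ideas described in the abstract.
# All names and shapes are assumptions, not the authors' code.
import torch
import numpy as np
from scipy.fft import dctn, idctn


def residual_training_target(frames: torch.Tensor) -> torch.Tensor:
    """frames: (T, C, H, W) clip whose first frame is the input static image.

    Returns per-frame motion residuals: the diffusion model is trained to
    model the distribution of these residuals, rather than the raw frames,
    so the static appearance of the input image is preserved by construction.
    """
    first = frames[:1]            # input image, broadcast over the time axis
    return frames - first         # only the change relative to the input


def dct_noise_refinement(image_latent: np.ndarray,
                         noise: np.ndarray,
                         low_freq_cutoff: int = 8) -> np.ndarray:
    """Blend the low-frequency DCT band of the input-image latent with the
    high-frequency band of Gaussian noise, so sampling starts from an
    initialization that already agrees with the image's coarse layout.

    image_latent, noise: arrays whose last two axes are spatial (H, W).
    low_freq_cutoff: assumed size of the retained low-frequency block.
    """
    img_dct = dctn(image_latent, norm="ortho")
    noise_dct = dctn(noise, norm="ortho")
    mask = np.zeros_like(img_dct)
    mask[..., :low_freq_cutoff, :low_freq_cutoff] = 1.0  # keep coarse structure
    mixed = mask * img_dct + (1.0 - mask) * noise_dct
    return idctn(mixed, norm="ortho")
```

In this reading, the training-stage function defines the denoising target, while the inference-stage function replaces the usual pure-Gaussian initialization; how the refined noise is scheduled across sampling steps is a detail the abstract does not specify.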
