

Generative Image Dynamics

Zhengqi Li · Richard Tucker · Noah Snavely · Aleksander Holynski

Arch 4A-E Poster #117
Fri 21 Jun 5 p.m. PDT — 6:30 p.m. PDT
Oral presentation: Orals 6B Image & Video Synthesis
Fri 21 Jun 1 p.m. PDT — 2:30 p.m. PDT


We present an approach to modeling an image-space prior on scene motion. Our prior is learned from a collection of motion trajectories extracted from real video sequences depicting natural, oscillatory dynamics of objects such as trees, flowers, candles, and clothes swaying in the wind. We model dense, long-term motion in the Fourier domain as spectral volumes, which we find are well-suited to prediction with diffusion models. Given a single image, our trained model uses a frequency-coordinated diffusion sampling process to predict a spectral volume, which can be converted into a motion texture that spans an entire video. Along with an image-based rendering module, the predicted motion representation can be used for a number of downstream applications, such as turning still images into seamlessly looping videos, or allowing users to realistically interact with objects in a real picture by interpreting the spectral volumes as image-space modal bases, which approximate object dynamics.
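The abstract describes predicting a spectral volume — per-pixel complex Fourier coefficients over a small set of temporal frequencies — and converting it into a motion texture, i.e. a dense 2D displacement field for every frame of the output video. As a rough illustration of that conversion step only (not the authors' implementation; the array layout, frequency count, and normalization here are assumptions), the mapping can be sketched as an inverse real FFT along the frequency axis:

```python
import numpy as np

def spectral_volume_to_motion_texture(spectral_volume, num_frames):
    """Convert a spectral volume of complex Fourier coefficients,
    shaped (K, H, W, 2) for K temporal frequencies and per-pixel
    (x, y) displacement terms, into a motion texture of per-frame
    2D displacements shaped (num_frames, H, W, 2).

    Assumed layout: frequency index 0 is the DC term, matching the
    coefficient ordering expected by numpy's inverse real FFT.
    """
    # Move the frequency axis last so the inverse FFT runs over it.
    coeffs = np.moveaxis(spectral_volume, 0, -1)          # (H, W, 2, K)
    # Inverse real FFT: K frequency terms -> num_frames time samples.
    motion = np.fft.irfft(coeffs, n=num_frames, axis=-1)  # (H, W, 2, T)
    return np.moveaxis(motion, -1, 0)                     # (T, H, W, 2)

# Toy example: energy at a single low frequency yields a smooth,
# periodic oscillation — the kind of natural motion the paper targets.
K, H, W, T = 8, 4, 4, 16
sv = np.zeros((K, H, W, 2), dtype=np.complex64)
sv[1] = 1.0 + 0.0j  # one nonzero frequency component, uniform over pixels
tex = spectral_volume_to_motion_texture(sv, T)
print(tex.shape)  # (16, 4, 4, 2)
```

Because only a few low frequencies are needed to represent oscillatory motion like swaying trees or flickering candles, the spectral volume is far more compact than storing per-frame flow fields directly, which is part of what makes it a convenient target for diffusion-model prediction.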
