

Paper in Workshop: AI for Creative Visual Content Generation, Editing and Understanding

STAM: Zero-Shot Style Transfer using Diffusion Model via Attention Modulation

Masud Fahim · Nazmus Saqib · Jani Boutellier


Abstract:

Diffusion models serve as the basis for several zero-shot image editing applications, including image generation and style transfer. The basic approach to style transfer with diffusion models is to swap attention components between the provided content and style images. A straightforward interchange of these components, however, can lead to inadequate style injection or loss of content-image characteristics. This paper addresses the shortcomings of attention-guided style transfer through two novel contributions: a) preserving content via dual-path attention aggregation and b) maintaining the impact of style through modulation of attention components. By combining these contributions, the proposed STAM approach provides aesthetically appealing yet content-preserving style transfer and is also applicable to prompt-driven style transfer. STAM is validated by extensive qualitative and quantitative evaluations against ten recent works, most of which it outperforms. Beyond style transfer quality, STAM is also compared to previous work in terms of inference time and remains close to the fastest competing approaches.
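To make the abstract's two ideas concrete, below is a minimal PyTorch sketch of attention-component swapping with the two mechanisms the abstract names: a dual-path aggregation that blends a content-preserving attention path with a style-injected one, and a modulation of the style attention. The function name `style_injected_attention` and the parameters `gamma` (blend weight) and `tau` (attention temperature) are illustrative assumptions for this sketch; the paper's exact modulation and aggregation formulas are not given on this page and may differ.

```python
import torch
import torch.nn.functional as F

def style_injected_attention(q_c, k_c, v_c, k_s, v_s, gamma=0.7, tau=1.5):
    """Hedged sketch of attention-swap style transfer in a diffusion U-Net.

    q_c, k_c, v_c: query/key/value from the content image's denoising
                   path, shape (B, N, D)
    k_s, v_s:      key/value from the style image's denoising path,
                   same shape (the swapped attention components)
    gamma:         blend weight for dual-path aggregation (assumed form)
    tau:           temperature amplifying the style attention map
                   (assumed modulation, not the authors' exact rule)
    """
    scale = q_c.shape[-1] ** -0.5

    # Path 1: ordinary self-attention over the content features,
    # which preserves the content image's structure.
    attn_c = F.softmax(q_c @ k_c.transpose(-2, -1) * scale, dim=-1)
    out_c = attn_c @ v_c

    # Path 2: content queries attend to STYLE keys/values (the
    # attention-component swap); tau > 1 sharpens the attention map,
    # strengthening style injection.
    attn_s = F.softmax(q_c @ k_s.transpose(-2, -1) * scale * tau, dim=-1)
    out_s = attn_s @ v_s

    # Dual-path aggregation: convex blend of the two paths, trading
    # style strength against content preservation.
    return gamma * out_s + (1.0 - gamma) * out_c

# Smoke test with random features standing in for U-Net activations.
B, N, D = 1, 64, 320
q = torch.randn(B, N, D)
out = style_injected_attention(q, torch.randn(B, N, D), torch.randn(B, N, D),
                               torch.randn(B, N, D), torch.randn(B, N, D))
print(out.shape)  # torch.Size([1, 64, 320])
```

In this reading, setting gamma to 1 recovers plain attention swapping (maximal style, weakest content preservation), while gamma near 0 keeps the content path intact; the temperature tau offers a second knob so style strength need not come only at the expense of content.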
