Skip to yearly menu bar Skip to main content


Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

Hongjie Wang · Difan Liu · Yan Kang · Yijun Li · Zhe Lin · Niraj Jha · Yuchen Liu

Arch 4A-E Poster #141
[ ] [ Project Page ]
Thu 20 Jun 5 p.m. PDT — 6:30 p.m. PDT


Diffusion models (DMs) have exhibited superior performance in generating high-quality and diverse images. However, this exceptional performance comes at the cost of expensive architectural design, particularly with the attention module heavily used in leading models.Existing works mainly adopt a retraining process under one specific compression ratio to enhance the efficiency which is computational expensive and less scalable. To this end, we introduce the Attention-driven Training-free Efficient Diffusion Model (AT-EDM), a framework that leverages attention maps to perform run-time pruning of redundant tokens during inference without retraining. Specifically, we develop a single denoising step pruning strategy to identify redundant tokens and a similarity-based recovery method to restore tokens for convolution operation.Additionally, a novel pruning schedule is proposed to adjust the pruning budget across different denoising timesteps for better generation quality.Extensive evaluations show that our method performs favorably against prior arts in terms of efficiency (e.g., 38.8\% FLOPs saving over Stable Diffusion XL) while maintaining nearly the same FID and CLIP scores with the full model.

Live content is unavailable. Log in and register to view live content