Poster: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training
Lexington Whalen · Zhenbang Du · Haoran You · Chaojian Li · Sixu Li · Yingyan (Celine) Lin
Abstract:
Training diffusion models (DMs) is computationally demanding, requiring many forward and backward passes across numerous timesteps. This challenge has motivated the exploration of various efficient DM training techniques. In this paper, we propose EB-Diff-Train, a new and orthogonal efficient DM training approach that investigates and leverages Early-Bird (EB) tickets: sparse subnetworks that emerge early in the training process and maintain high generation quality. We first investigate the existence of traditional EB tickets in DMs, which enable competitive generation quality without fully training a dense model. Then, we delve into the concept of diffusion-dedicated EB tickets, which draw on the insight that different timestep regions/periods vary in importance. These tickets adapt their sparsity levels to the importance of the corresponding timestep regions, allowing aggressive sparsity in non-critical regions while conserving computational resources for crucial timestep regions. Building on this, we develop an efficient DM training technique that derives timestep-aware EB tickets, trains them in parallel, and combines them through an ensemble during inference for image generation. This approach can significantly reduce training time both spatially and temporally, achieving 2.9× to 5.8× speedups over training unpruned dense models and up to 10.3× faster training than standard train-prune-finetune pipelines, without compromising generative quality. Extensive experiments and ablation studies validate the existence of both traditional and timestep-aware EB tickets, as well as the effectiveness of our proposed EB-Diff-Train method. Our work not only enhances the understanding of DM training dynamics but also significantly improves training efficiency by exploiting temporal and spatial sparsity. All code and models will be released upon acceptance.
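The timestep-aware idea described in the abstract (one sparse ticket per timestep region, selected at inference according to the current timestep) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three region boundaries, the sparsity values, the use of plain magnitude pruning as a stand-in for the ticket-drawing criterion, and all names (`TimestepRegion`, `magnitude_mask`, `ticket_for_timestep`). In the described method each ticket is a separately trained subnetwork; the shared toy weight vector below only shows how sparsity can differ by region.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class TimestepRegion:
    start: int       # first timestep in the region (inclusive)
    end: int         # last timestep in the region (exclusive)
    sparsity: float  # fraction of weights pruned for this region's ticket


def magnitude_mask(weights: Sequence[float], sparsity: float) -> List[int]:
    """Return a 0/1 mask that prunes the smallest-magnitude weights.

    Magnitude pruning is an assumed stand-in criterion for this sketch.
    """
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def region_for_timestep(t: int, regions: Sequence[TimestepRegion]) -> int:
    """Return the index of the region that contains timestep t."""
    for idx, region in enumerate(regions):
        if region.start <= t < region.end:
            return idx
    raise ValueError(f"timestep {t} is outside all regions")


# Hypothetical regions over 1000 timesteps; which regions are "critical"
# and how sparse each ticket should be are illustrative guesses only.
REGIONS = [
    TimestepRegion(0, 300, 0.25),
    TimestepRegion(300, 700, 0.50),
    TimestepRegion(700, 1000, 0.75),
]

# Toy stand-in for the weights of one layer.
DENSE_WEIGHTS = [0.9, -0.1, 0.4, -0.7, 0.05, 0.3, -0.6, 0.2]

# One mask (ticket) per region, each with its own sparsity level.
TICKET_MASKS = [magnitude_mask(DENSE_WEIGHTS, r.sparsity) for r in REGIONS]


def ticket_for_timestep(t: int) -> List[int]:
    """Pick the ticket mask responsible for timestep t (inference-time routing)."""
    return TICKET_MASKS[region_for_timestep(t, REGIONS)]
```

During sampling, a loop over timesteps would call `ticket_for_timestep(t)` at each step and run only the selected subnetwork, so every denoising step is handled by the ticket whose region contains it.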