DRiffusion: Draft-and-Refine Process Parallelizes Diffusion Models with Ease
Runsheng Bai ⋅ Chengyu Zhang ⋅ Yangdong Deng
Abstract
Diffusion models have achieved remarkable success in generating high-fidelity content but suffer from slow, iterative sampling, resulting in high latency that limits their use in interactive applications. We introduce DRiffusion, a framework that parallelizes diffusion inference through a draft-and-refine process. DRiffusion employs skip connections to generate multiple draft states for future timesteps and computes their corresponding noises in parallel, which are then used in the standard denoising process to produce refined results. Theoretically, our method reduces sequential runtime to a fraction $\tfrac{1}{n}$ or $\tfrac{2}{n+1}$ of the original, depending on whether the conservative or aggressive mode is used, where $n$ denotes the number of devices. Empirically, DRiffusion attains a 1.5×–4× speedup on Stable Diffusion 2.1 with minimal degradation in generation quality. On the MS-COCO dataset, both FID and CLIP score remain close to those of the original sampler: averaged across configurations, DRiffusion improves FID by 0.45 while incurring a negligible 0.06 drop in CLIP score. These results show that DRiffusion delivers substantial acceleration while largely preserving perceptual quality.
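The draft-and-refine loop described above can be sketched in miniature. This is a hypothetical illustration, not DRiffusion's implementation: `toy_denoiser`, the skip-connection drafting (reusing the current state as the draft for each future timestep in a chunk), and the simple subtraction update are all stand-in assumptions for exposition only.

```python
# Hypothetical sketch of a draft-and-refine sampling loop.
# toy_denoiser and the update rule are illustrative stand-ins,
# not the actual DRiffusion model or scheduler.
import numpy as np

def toy_denoiser(x, t):
    # Stand-in for the diffusion model's noise predictor at timestep t.
    return 0.1 * x / (t + 1)

def draft_and_refine(x0, timesteps, block=4):
    x = x0
    i = 0
    while i < len(timesteps):
        chunk = timesteps[i:i + block]
        # Draft: skip-connect the current state to every future timestep
        # in the chunk, so the noise predictions are independent and could
        # be evaluated in parallel (simulated here on one device).
        noises = [toy_denoiser(x, t) for t in chunk]
        # Refine: apply the standard sequential denoising update,
        # consuming the precomputed noises.
        for eps in noises:
            x = x - eps
        i += block
    return x

x = draft_and_refine(np.ones(4), list(range(10, 0, -1)))
```

With `block` drafts per round, the sequential model evaluations per round drop from `block` to one batched call, which is where the parallel speedup would come from on multiple devices.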