Paper in Workshop: 2nd Workshop on Efficient and On-Device Generation (EDGE)

Latent Patched Efficient Diffusion Model For High Resolution Image Synthesis

Weiyun Jiang · Devendra Kumar Jangid · Seok-Jun Lee · Hamid Sheikh


Abstract:

Generating high-resolution images with diffusion models is challenging because of the large-scale computational resources and training datasets required. Patch-based strategies during training and inference are used to reduce these computational costs; however, strong grid artifacts appear due to the stochastic nature of the diffusion process. Overlapping patches can avoid grid artifacts, but they increase the inference time of diffusion. In this paper, we propose a latent patch diffusion model for unconditional image generation that uses neighboring overlapping patches during training to avoid grid artifacts without increasing training or inference time. Since we apply the overlapping strategy in latent space, our GPU VRAM requirement, training time, and inference time decrease compared to state-of-the-art methods. Additionally, we introduce a multi-level encoder to obtain global semantic information for unconditional image generation. We conducted experiments on the CelebA and FFHQ datasets and demonstrated that our algorithm is efficient in terms of memory, training time, and inference time for unconditional image generation.
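The overlapping-patch idea behind the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the patch size, stride, and uniform averaging weights are assumptions, and the per-patch denoiser is replaced by an identity operation. Overlapping windows are extracted from a 2-D latent map and then averaged back together, which is the mechanism that suppresses grid seams between neighboring patches.

```python
import numpy as np

def patch_coords(shape, patch, stride):
    """Top-left corners of an overlapping sliding window over a 2-D latent map."""
    H, W = shape
    return [(y, x)
            for y in range(0, H - patch + 1, stride)
            for x in range(0, W - patch + 1, stride)]

def merge_patches(patches, coords, shape, patch):
    """Average overlapping patches back into one map; per-pixel weights
    count how many patches cover each location."""
    out = np.zeros(shape)
    weight = np.zeros(shape)
    for p, (y, x) in zip(patches, coords):
        out[y:y + patch, x:x + patch] += p
        weight[y:y + patch, x:x + patch] += 1.0
    return out / weight

# Toy 8x8 latent map; a hypothetical per-patch denoiser is the identity here.
latent = np.arange(64, dtype=float).reshape(8, 8)
coords = patch_coords(latent.shape, patch=4, stride=2)
patches = [latent[y:y + 4, x:x + 4] for (y, x) in coords]
merged = merge_patches(patches, coords, latent.shape, patch=4)
```

With an identity per-patch operation, the averaged reconstruction recovers the original map exactly; with a stochastic denoiser, the same averaging blends disagreements between neighboring patches instead of exposing them as grid artifacts.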
