

Poster

Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis

Jiapeng Zhu · Ceyuan Yang · Kecheng Zheng · Yinghao Xu · Zifan Shi · Yifei Zhang · Qifeng Chen · Yujun Shen

ExHall D Poster #250
Sat 14 Jun 3 p.m. PDT — 5 p.m. PDT

Abstract: Because they are difficult to scale up, generative adversarial networks (GANs) appear to be falling out of favor for text-conditioned image synthesis. Sparsely activated mixture-of-experts (MoE) has recently been shown to be an effective way to train large-scale models with limited resources. Inspired by this, we present Aurora, a GAN-based text-to-image generator that employs a collection of experts to learn feature processing, together with a sparse router that adaptively selects the most suitable expert for each feature point. We adopt a two-stage training strategy, which first learns a base model at 64×64 resolution, followed by an upsampler that produces 512×512 images. Trained with only public data, our approach encouragingly closes the performance gap between GANs and industry-level diffusion models while maintaining fast inference speed. We will release the code and checkpoints to support more comprehensive studies of GANs by the community.
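
To make the idea of a sparse router selecting one expert per feature point concrete, here is a minimal sketch of generic top-1 MoE routing in plain Python. It is not Aurora's implementation: the abstract does not specify the router or expert design, so the linear router, the linear experts, the gate-probability scaling, and all names (`sparse_moe`, `router_w`, `expert_ws`) are assumptions chosen for illustration.

```python
import math
import random

def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def make_matrix(rows, cols, rng):
    """Small random matrix used as a stand-in for learned weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def sparse_moe(features, router_w, expert_ws):
    """Route each feature point to its single highest-scoring expert (top-1).

    features:  list of feature vectors, one per spatial location
    router_w:  (num_experts x dim) matrix producing routing logits (assumed linear)
    expert_ws: list of (dim x dim) matrices, one hypothetical linear expert each
    """
    outputs, choices = [], []
    for x in features:
        probs = softmax(matvec(router_w, x))
        k = max(range(len(probs)), key=probs.__getitem__)
        y = matvec(expert_ws[k], x)                 # only the chosen expert runs
        outputs.append([probs[k] * v for v in y])   # assumed: scale by gate probability
        choices.append(k)
    return outputs, choices

rng = random.Random(0)
dim, num_experts, num_points = 8, 4, 16   # e.g. a flattened 4x4 feature map
features = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_points)]
router_w = make_matrix(num_experts, dim, rng)
expert_ws = [make_matrix(dim, dim, rng) for _ in range(num_experts)]
outputs, choices = sparse_moe(features, router_w, expert_ws)
```

Because only one expert runs per feature point, the compute per point stays roughly constant as the number of experts grows, which is the general appeal of sparse activation; how Aurora realizes this in its generator is described in the paper rather than here.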
