

Visual Prompt Tuning for Generative Transfer Learning

Kihyuk Sohn · Huiwen Chang · José Lezama · Luisa Polania · Han Zhang · Yuan Hao · Irfan Essa · Lu Jiang

West Building Exhibit Halls ABC 319


Learning generative image models from various domains efficiently requires transferring knowledge from an image synthesis model trained on a large dataset. We present a recipe for learning vision transformers by generative knowledge transfer. We base our framework on generative vision transformers that represent an image as a sequence of visual tokens, using either autoregressive or non-autoregressive transformers. To adapt to a new domain, we employ prompt tuning, which prepends learnable tokens called prompts to the image token sequence, and we introduce a new prompt design for our task. We study a variety of visual domains with varying amounts of training images. We show the effectiveness of knowledge transfer and significantly better image generation quality. Code is available at
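The core idea can be sketched as follows: keep the pretrained generative transformer frozen and train only a small set of prompt embeddings that are prepended to the image token sequence. This is a minimal illustrative sketch, not the paper's implementation; the class name, dimensions, and the stand-in linear layer used in place of the real transformer are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


class PromptTunedGenerator:
    """Minimal sketch of prompt tuning for a generative transformer.

    The pretrained weights stay frozen; only the prepended prompt
    embeddings would be updated when adapting to a new domain.
    A single linear map stands in for the frozen transformer.
    """

    def __init__(self, dim=8, n_prompts=4):
        # Frozen pretrained weights (placeholder for the transformer).
        self.frozen_w = rng.standard_normal((dim, dim))
        # Learnable prompt tokens, prepended to every token sequence.
        self.prompts = rng.standard_normal((n_prompts, dim)) * 0.01

    def forward(self, image_tokens):
        # Prepend prompts: shape becomes (n_prompts + seq_len, dim).
        x = np.concatenate([self.prompts, image_tokens], axis=0)
        # Pass through the frozen model (placeholder linear map).
        return x @ self.frozen_w


model = PromptTunedGenerator(dim=8, n_prompts=4)
image_tokens = rng.standard_normal((16, 8))  # 16 visual tokens of dim 8
out = model.forward(image_tokens)
print(out.shape)  # (20, 8): 4 prompt tokens + 16 image tokens
```

In actual fine-tuning, a gradient step would touch only `self.prompts`, leaving `self.frozen_w` untouched, which is what makes the adaptation parameter-efficient.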
