Taming Stable Diffusion for Text to 360 Panorama Image Generation

Cheng Zhang · Qianyi Wu · Camilo Cruz Gambardella · Xiaoshui Huang · Dinh Phung · Wanli Ouyang · Jianfei Cai

Arch 4A-E Poster #150
Award: Highlight
Wed 19 Jun 5 p.m. PDT — 6:30 p.m. PDT


Generative models, e.g., Stable Diffusion, have enabled the creation of photorealistic images from text prompts. Yet, the generation of 360-degree panorama images from text remains a challenge, particularly due to the dearth of paired text-panorama data and the domain gap between panorama and perspective images. In this paper, we introduce a novel dual-branch diffusion model named PanFusion to generate a 360-degree image from a text prompt. We leverage the Stable Diffusion model as one branch to provide prior knowledge in natural image generation and register it to another panorama branch for holistic image generation. We propose a unique cross-attention mechanism with projection awareness to minimize distortion during the collaborative denoising process. Our experiments validate that PanFusion surpasses existing methods and, thanks to its dual-branch structure, can integrate additional constraints like room layout for customized panorama outputs.
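
As a rough illustration of the geometry behind the gap between panorama and perspective images, the sketch below maps a pixel of an equirectangular panorama to a viewing direction and then projects that direction into a pinhole (perspective) view. This is not PanFusion's code, and the abstract does not specify how its projection-aware attention is implemented; the function names, the axis convention (camera looking along +z, y up), and the square pinhole camera are assumptions made only for this example.

```python
import math


def equirect_pixel_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit viewing direction.

    Assumed convention (hypothetical, for illustration only):
    longitude 0 is at the horizontal centre of the panorama,
    latitude 0 is at the vertical centre, +y points up, +z forward.
    """
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi  # in [-pi, pi)
    lat = (0.5 - (v + 0.5) / height) * math.pi       # in (-pi/2, pi/2)
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def direction_to_perspective_pixel(direction, fov_deg, size):
    """Project a direction into a square pinhole view looking along +z.

    Returns (px, py) in pixel units, or None when the direction lies
    behind the camera or outside the field of view.
    """
    x, y, z = direction
    if z <= 0.0:
        return None  # behind the image plane
    # Focal length in pixels derived from the field of view.
    focal = (size / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    px = focal * x / z + size / 2.0
    py = -focal * y / z + size / 2.0  # image rows grow downwards
    if not (0.0 <= px < size and 0.0 <= py < size):
        return None
    return (px, py)
```

Under these assumptions, a pixel near the centre of the panorama lands near the centre of the perspective view, while pixels near the left and right edges face away from the camera and have no perspective counterpart. A correspondence of this kind is what lets features at a panorama location be matched with features at a perspective location.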
