

Poster

Dual Prompting for Image Restoration across Full-Scene with Diffusion Transformers

Dehong Kong · Fan Li · Zhixin Wang · Jiaqi Xu · Renjing Pei · Wenbo Li · Wenqi Ren


Abstract:

Recent state-of-the-art image restoration methods mostly adopt latent diffusion models with U-Net backbones, yet they still face challenges in achieving high-quality restoration due to their limited capabilities. Diffusion transformers (DiTs), such as SD3, are emerging as a promising alternative because of their superior quality and scalability. However, previous conditional control methods for U-Net-based diffusion models, such as ControlNet, are not well-suited for DiTs. In this paper, we introduce DPIR (Dual Prompting Image Restoration), a novel DiT-based image restoration method that effectively extracts conditional information from low-quality images from multiple perspectives. Specifically, DPIR consists of two branches: a low-quality image prior conditioning branch and a dual prompting control branch. The former injects the low-quality image prior into the DiT with high training efficiency. More importantly, we believe that in image restoration, an image's textual description alone cannot fully capture its rich visual characteristics. Therefore, a dual prompting module is designed to provide the DiT with additional visual cues that capture both global context and local appearance. The extracted global-local visual prompts, used as extra conditional control together with text prompts, greatly enhance the quality and fidelity of the restoration. Extensive experimental results demonstrate that DPIR delivers superior image restoration performance with broad applicability.
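To make the dual prompting idea concrete, below is a minimal, hypothetical sketch of how global and local visual prompts could be extracted from low-quality image features and appended to text prompt embeddings before conditioning a DiT. The module names, dimensions, and fusion strategy are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a dual prompting module: one global token summarizing
# full-scene context plus learnable local tokens that attend to patch features.
# All shapes and design choices are assumptions for illustration only.
import torch
import torch.nn as nn


class DualPromptModule(nn.Module):
    def __init__(self, feat_dim: int = 768, prompt_dim: int = 1024, num_local: int = 16):
        super().__init__()
        # Global prompt: project the mean-pooled feature into one context token.
        self.global_proj = nn.Linear(feat_dim, prompt_dim)
        # Local prompts: learnable queries that gather local appearance cues.
        self.local_queries = nn.Parameter(torch.randn(num_local, prompt_dim))
        self.local_attn = nn.MultiheadAttention(prompt_dim, num_heads=8, batch_first=True)
        self.patch_proj = nn.Linear(feat_dim, prompt_dim)

    def forward(self, lq_patch_feats: torch.Tensor) -> torch.Tensor:
        # lq_patch_feats: (B, N, feat_dim) patch features of the low-quality image.
        global_prompt = self.global_proj(lq_patch_feats.mean(dim=1, keepdim=True))  # (B, 1, D)
        patches = self.patch_proj(lq_patch_feats)                                   # (B, N, D)
        queries = self.local_queries.unsqueeze(0).expand(patches.size(0), -1, -1)   # (B, K, D)
        local_prompts, _ = self.local_attn(queries, patches, patches)               # (B, K, D)
        # Concatenate global and local visual prompts into one conditioning sequence.
        return torch.cat([global_prompt, local_prompts], dim=1)                     # (B, 1+K, D)


# Usage: the visual prompts are concatenated with text prompt embeddings, so the
# DiT receives both textual and global-local visual conditioning.
if __name__ == "__main__":
    module = DualPromptModule()
    lq_feats = torch.randn(2, 256, 768)   # e.g., 16x16 patch features from an LQ encoder
    text_emb = torch.randn(2, 77, 1024)   # text prompt embeddings
    cond = torch.cat([text_emb, module(lq_feats)], dim=1)
    print(cond.shape)                     # torch.Size([2, 94, 1024])
```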
