Poster Sat, Jun 6, 2026 • 3:45 PM – 5:45 PM PDT ExHall A & F 117

UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement

Weiqi Li ⋅ Xuanyu Zhang ⋅ Bin Chen ⋅ Jingfen Xie ⋅ Yan Wang ⋅ Kexin Zhang ⋅ Junlin Li ⋅ Li zhang ⋅ Jian Zhang ⋅ Shijie Zhao

Paper PDF

Abstract

Image quality assessment (IQA) and image restoration are fundamental problems in low-level vision. Although IQA and restoration are closely connected conceptually, most existing work treats them in isolation. Recent advances in unified multimodal understanding-generation models demonstrate promising results and indicate that stronger understanding can improve generative performance. This motivates a single model that unifies IQA and restoration and explicitly studies how IQA can guide restoration, a setting that remains largely underexplored yet highly valuable. In this paper, we propose UARE, to our knowledge the first Unified vision-language model for image quality Assessment, Restoration, and Enhancement. Built on pretrained unified understanding and generation models, we introduce a two-stage training framework. First, a progressive, easy-to-hard schedule expands from single-type distortions to higher-order mixed degradations, enabling UARE to handle multiple degradations. Second, we perform unified fine-tuning of quality understanding and restoration with interleaved text-image data, aligning IQA signals with restoration objectives. Through multi-task co-training, UARE leverages IQA to boost restoration and enhancement performance. Extensive experiments across IQA, restoration, and enhancement tasks demonstrate the effectiveness of UARE. The code and models will be made available.