Seeing Beyond 8 Bits: Subjective and Objective Quality Assessment of HDR-UGC Videos
SHRESHTH SAINI ⋅ Bowen Chen ⋅ Yilin Wang ⋅ Neil Birkbeck ⋅ Balu Adsumilli ⋅ Alan C.
Abstract
High Dynamic Range (HDR) user-generated content (UGC) videos are rapidly proliferating across social platforms, yet most perceptual video quality assessment (VQA) systems remain tailored to Standard Dynamic Range (SDR). HDR's higher bit depth, wide color gamut, and elevated luminance range expose distortions such as near-black crushing, highlight clipping, banding, and exposure flicker that amplify UGC artifacts and challenge SDR models. To catalyze progress, we curate \textbf{HDR-UGC-44K}, a large-scale subjective dataset of $\sim$44K videos from 6.5K sources with over 1.5M crowd ratings, spanning diverse scenes, capture conditions, and compression settings. We further introduce \textbf{HDR-Q}, the first Multimodal Large Language Model (MLLM) for HDR-UGC VQA. We propose (i) a novel HDR-aware vision encoder that produces HDR-sensitive embeddings, and (ii) HDR-Aware Policy Optimization (HAPO), an RL finetuning framework that anchors reasoning to HDR cues. HAPO augments GRPO with an HDR--SDR contrastive KL term that encourages token reliance on HDR inputs, and a Gaussian-weighted regression reward for fine-grained MOS calibration. Across HDR-UGC-44K and public HDR-VQA benchmarks, HDR-Q delivers state-of-the-art performance. The dataset and code will be released.
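The two HAPO reward components described above can be illustrated with a minimal sketch. The exact formulas are not given in the abstract, so the function names, the Gaussian kernel form, and the bandwidth parameter `sigma` below are assumptions chosen to match the stated intent: a reward that peaks when the predicted MOS matches the ground truth and decays smoothly with error, plus a KL divergence between the model's token distributions conditioned on HDR versus SDR inputs (a larger divergence suggests the tokens actually depend on the HDR signal).

```python
import math

def gaussian_regression_reward(pred_mos: float, gt_mos: float, sigma: float = 0.5) -> float:
    """Hypothetical Gaussian-weighted regression reward for MOS calibration.

    Returns 1.0 for a perfect prediction and decays smoothly as the
    prediction drifts from the ground-truth MOS; sigma controls how
    sharply fine-grained errors are penalized (an assumed form).
    """
    return math.exp(-((pred_mos - gt_mos) ** 2) / (2.0 * sigma ** 2))

def contrastive_kl(p_hdr: list[float], p_sdr: list[float]) -> float:
    """KL(p_hdr || p_sdr) between next-token distributions conditioned on
    HDR vs. SDR versions of the same clip. Zero means the tokens ignore
    the HDR input; a larger value indicates reliance on HDR cues.
    """
    return sum(p * math.log(p / q) for p, q in zip(p_hdr, p_sdr) if p > 0.0)
```

As a design note, maximizing a contrastive KL of this shape pushes the policy's outputs apart under HDR and SDR conditioning, which is one plausible way to operationalize "anchoring reasoning to HDR cues"; the actual HAPO objective may combine these terms differently.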