Poster
Boosting Video Quality Assessment via Saliency-guided Local Perception
Yunpeng Qu · Kun Yuan · Qizhi Xie · Ming Sun · Chao Zhou · Jian Wang
Video Quality Assessment (VQA), which aims to predict the perceptual quality of videos, has attracted increasing attention. Due to factors such as motion blur or specific distortions, the quality of different regions within a video varies. Recognizing region-wise local quality is beneficial for assessing overall quality and can guide fine-grained enhancement or transcoding strategies. However, because annotating region-wise quality is difficult and costly, existing datasets lack such ground-truth constraints, which further complicates exploiting local perception. Inspired by the Human Visual System (HVS), in which overall quality is closely related to the local texture of different regions and their visual saliency, we propose a Saliency-guided Local Perception VQA (SLP-VQA) framework, which aims to effectively assess both saliency and local texture, thereby facilitating the assessment of overall quality. Our framework extracts global visual saliency and allocates attention using Fusion-Window Attention (FWA), while incorporating a Local Perception Constraint (LPC) to mitigate the reliance of regional texture perception on neighboring areas. Compared to state-of-the-art (SOTA) methods, SLP-VQA achieves significant improvements across multiple scenarios on five VQA benchmarks. Furthermore, to assess local perception, we establish a new Local Perception Visual Quality (LPVQ) dataset with region-wise annotations. Experimental results and visualizations demonstrate the capability of SLP-VQA in perceiving local distortions. SLP-VQA models and the LPVQ dataset will be publicly available.
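The abstract does not spell out how FWA or LPC are implemented, but the core idea of combining windowed local attention with saliency-guided aggregation can be illustrated with a minimal sketch. The module below is a hypothetical simplification: the class name, window partitioning scheme, and saliency-weighted pooling are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn


class SaliencyWeightedWindowAttention(nn.Module):
    """Illustrative sketch: local window self-attention whose outputs are
    aggregated with a global saliency map (not the paper's actual FWA)."""

    def __init__(self, dim: int, window: int = 7, heads: int = 4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats: torch.Tensor, saliency: torch.Tensor) -> torch.Tensor:
        # feats:    (B, C, H, W) patch features; H and W divisible by window
        # saliency: (B, 1, H, W) global saliency map in [0, 1]
        B, C, H, W = feats.shape
        w = self.window
        # Partition features into non-overlapping windows.
        x = feats.unfold(2, w, w).unfold(3, w, w)           # (B, C, H/w, W/w, w, w)
        x = x.permute(0, 2, 3, 4, 5, 1).reshape(-1, w * w, C)
        # Local self-attention within each window (region-wise perception).
        x, _ = self.attn(x, x, x)
        # Fold windows back onto the spatial grid.
        x = x.reshape(B, H // w, W // w, w, w, C)
        x = x.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        # Saliency-guided pooling: salient regions contribute more to the
        # global quality representation.
        weights = saliency / (saliency.sum(dim=(2, 3), keepdim=True) + 1e-6)
        return (x * weights).sum(dim=(2, 3))                # (B, C)


if __name__ == "__main__":
    feats = torch.randn(2, 64, 28, 28)
    saliency = torch.rand(2, 1, 28, 28)
    module = SaliencyWeightedWindowAttention(dim=64, window=7)
    print(module(feats, saliency).shape)  # torch.Size([2, 64])
```

In this sketch the saliency map acts as a soft spatial prior over local features; the pooled vector could then feed a regression head that predicts the overall quality score, while region-wise supervision (as in the paper's LPC) would constrain the pre-pooling local features.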