Probabilistic Prompt Adaptation for Unified Image Aesthetics and Quality Assessment
Abstract
Recent advances in vision–language foundation models have enabled text-driven evaluation of image aesthetics and visual quality. However, existing models are typically optimized for fixed prompts or specific datasets, which limits their adaptability to diverse evaluation criteria. This paper presents \textit{Probabilistic Prompt Adaptation (PPA)}, a unified probabilistic framework that predicts aesthetic and quality scores conditioned on arbitrary text prompts. PPA formulates score prediction as a mixture over prompts, dynamically estimating the suitability of each prompt from both image content and task context. By marginalizing over prompts pre-sampled from a large language model (LLM), it can be trained without prompt annotations, using only triplets of task, image, and score. Experiments across multiple image aesthetics assessment (IAA) and image quality assessment (IQA) benchmarks demonstrate that PPA achieves consistent and perceptually aligned prompt-based scoring, allowing fine-grained control over evaluation semantics.
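As a rough sketch of the mixture formulation, under notation that is assumed here rather than taken from the abstract (image $x$, task $t$, score $s$, and $K$ prompts $c_1, \dots, c_K$ pre-sampled from the LLM), the marginalization over prompts could be written as
\begin{equation}
p(s \mid x, t) \;=\; \sum_{k=1}^{K} p(c_k \mid x, t)\, p(s \mid x, c_k),
\qquad \sum_{k=1}^{K} p(c_k \mid x, t) = 1,
\end{equation}
where $p(c_k \mid x, t)$ plays the role of the prompt-suitability estimate and $p(s \mid x, c_k)$ is the prompt-conditioned score predictor. Under this reading, training on triplets $(t, x, s)$ would maximize $\log p(s \mid x, t)$, so no per-prompt labels are needed. The exact parameterization of both factors is an assumption of this sketch.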