A³: Towards Advertising Aesthetic Assessment
Abstract
Advertising images significantly impact commercial conversion rates and brand equity, yet current evaluation methods rely on subjective judgments and lack scalability, standardized criteria, and interpretability. To address these challenges, we present A³ (Advertising Aesthetic Assessment), a comprehensive framework encompassing four components: a paradigm (A³-Law), a dataset (A³-Dataset), a multimodal large language model (A³-Align), and a benchmark (A³-Bench). Central to A³ is a theory-driven paradigm, A³-Law, comprising three hierarchical stages: (1) Perceptual Attention, which evaluates an image's perceptual signals for their ability to attract attention; (2) Formal Interest, which assesses how the formal composition of image color and spatial layout evokes interest; and (3) Desire Impact, which measures how well an image evokes desire and how persuasive it is. Building on A³-Law, we construct A³-Dataset with 120K instruction-response pairs from 30K advertising images, each richly annotated with multi-dimensional labels and Chain-of-Thought (CoT) rationales. We further develop A³-Align, trained under A³-Law with CoT-guided learning on A³-Dataset. Extensive experiments on A³-Bench demonstrate that A³-Align achieves superior alignment with A³-Law compared to existing models, and that this alignment generalizes well to quality advertisement selection and prescriptive advertisement critique, indicating its potential for broader deployment.