Towards Fine-Grained Attribution: Instance-Aware Preference Optimization for Aligning Diffusion Models
Abstract
Direct Preference Optimization (DPO) has achieved remarkable success in aligning diffusion models with human feedback. However, existing methods rely heavily on image-level preferences, which provide only spatially sparse reward signals. This creates a fundamental misalignment: an image may be globally preferred yet contain locally inferior instances. Applying a uniform positive preference to such an image thus unfairly credits distracting regions while penalizing informative ones, leading to suboptimal performance and inefficient learning. To resolve this issue, we propose Instance-Aware Preference Optimization (IAPO), which introduces instance-level credit assignment to advance alignment from the image level to the instance level. We first construct a high-quality instance-level preference dataset by automatically identifying and relabeling corresponding instances in image pairs using vision-language models and object detectors. Leveraging this fine-grained dataset, we design a novel instance alignment loss with a dynamic reweighting mask that modulates the instance-level loss within annotated bounding boxes, suppressing distractors to enforce fine-grained human preference alignment. Extensive experiments demonstrate that our method not only achieves state-of-the-art performance on multiple benchmarks but also attains higher training efficiency owing to its fine-grained instance-level preference labels.
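To make the reweighting idea concrete, here is a minimal sketch of how a spatial mask built from annotated bounding boxes could modulate a per-pixel preference loss. All function names, the box format `(y0, y1, x0, x1)`, and the specific weight values are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def instance_reweight_mask(shape, boxes, weights, base=1.0):
    """Build a spatial weight map: pixels inside each annotated box get that
    instance's weight (e.g. >1 for preferred instances, <1 for distractors);
    all other pixels keep the base image-level weight.
    Boxes are hypothetical (y0, y1, x0, x1) index ranges."""
    mask = np.full(shape, base, dtype=np.float32)
    for (y0, y1, x0, x1), w in zip(boxes, weights):
        mask[y0:y1, x0:x1] = w
    return mask

def instance_alignment_loss(per_pixel_loss, mask):
    """Weighted mean of a per-pixel preference loss under the instance mask."""
    return float((per_pixel_loss * mask).sum() / mask.sum())

# Toy 8x8 example: one preferred instance (upweighted) and one
# distractor instance with high loss (suppressed by a small weight).
loss_map = np.ones((8, 8), dtype=np.float32)
loss_map[4:8, 4:8] = 5.0  # distractor region has large raw loss
mask = instance_reweight_mask(
    (8, 8),
    boxes=[(0, 4, 0, 4), (4, 8, 4, 8)],
    weights=[2.0, 0.2],  # upweight preferred, suppress distractor
)
print(round(instance_alignment_loss(loss_map, mask), 4))
```

Note that without the mask the plain mean of `loss_map` is 2.0, dominated by the distractor; the masked loss is markedly lower because the distractor's contribution is down-weighted, which is the intended credit-assignment effect.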