COPYLENS: Towards Copyrighted Character Infringement Detection via Copyright-Aware Prompt Learning
Abstract
Recent advances in text-to-image (T2I) generation can produce images that closely resemble copyrighted characters, often indistinguishable from official depictions, raising serious concerns about intellectual property infringement. Consequently, robust detection of copyrighted character infringement is urgently needed. Yet, existing methods exhibit limited alignment with human judgments of the likelihood of infringement. To bridge this gap, we propose \textsc{CopyLens}, a novel prompt optimization framework that automatically refines textual prompts for vision-language model-based detectors to better match human infringement judgments. Our approach establishes a closed-loop refinement process between a large vision-language model (LVLM) and a large language model (LLM): the LVLM assesses generated images for potential infringement, while the LLM iteratively optimizes the detection prompts via meta-prompting, guided by feedback signals derived from consistency with human annotations. To facilitate the assessment of prompt-human alignment, we introduce \textsc{CopyChars}, a new large-scale dataset of over 7,000 AI-generated images spanning more than 100 popular copyrighted characters, accompanied by detailed human annotations of potential infringement. Extensive experiments on \textsc{CopyChars} show that \textsc{CopyLens} improves detection performance by 5\% to 10\% over recent state-of-the-art methods. This work offers a scalable and automated solution for visual copyright protection and highlights the critical role of prompt engineering in infringement detection.