Forensics-Bench: A Comprehensive Forgery Detection Benchmark Suite for Large Vision Language Models
Jin Wang · Chenghui Lv · Xian Li · Shichao Dong · Huadong Li · Kelu Yao · Chao Li · Wenqi Shao · Ping Luo
Abstract
Recently, the rapid development of AIGC has significantly increased the diversity of fake media spread on the Internet, posing unprecedented threats to social security, politics, and law. To detect the increasingly diverse malicious fake media in the new era of AIGC, recent studies have proposed exploiting Large Vision Language Models (LVLMs) to build robust forgery detectors, owing to their impressive performance on a wide range of multimodal tasks. However, a comprehensive benchmark for assessing LVLMs' ability to discern forged media is still lacking. To fill this gap, we present Forensics-Bench, a new forgery detection evaluation benchmark suite that assesses LVLMs across a broad range of forgery detection tasks requiring recognition, localization, and reasoning capabilities over diverse forgeries. Forensics-Bench comprises $63,292$ meticulously curated multiple-choice visual questions, covering $112$ unique forgery detection types from $5$ perspectives: forgery semantics, forgery modalities, forgery tasks, forgery types, and forgery models. We conduct thorough evaluations on $22$ open-sourced LVLMs and $3$ proprietary models (GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet), highlighting the significant challenges of comprehensive forgery detection posed by Forensics-Bench. We anticipate that Forensics-Bench will motivate the community to advance the frontier of LVLMs, striving for all-around forgery detectors in the era of AIGC.
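The abstract describes Forensics-Bench as a set of multiple-choice visual questions, each annotated along five perspectives (semantics, modality, task, type, and generating model). As a minimal illustrative sketch of how such a sample and its accuracy scoring could be represented, the field names and `accuracy` helper below are assumptions for exposition, not the benchmark's released schema:

```python
from dataclasses import dataclass

@dataclass
class ForensicsSample:
    """One multiple-choice visual question (hypothetical layout)."""
    image_path: str        # path to the (possibly forged) media
    question: str          # e.g. "Is this image forged?"
    choices: list[str]     # candidate answer options
    answer: str            # ground-truth option
    # The five annotation perspectives named in the abstract:
    semantics: str = ""    # forgery semantics
    modality: str = ""     # forgery modality
    task: str = ""         # forgery task (e.g. detection vs. localization)
    forgery_type: str = "" # forgery type
    model: str = ""        # forgery (generation) model

def accuracy(samples: list[ForensicsSample], predict) -> float:
    """Fraction of samples where the LVLM's chosen option matches the key.
    `predict` is any callable mapping a sample to one of its choices."""
    if not samples:
        return 0.0
    correct = sum(predict(s) == s.answer for s in samples)
    return correct / len(samples)
```

Per-perspective scores could then be obtained by filtering `samples` on one of the five annotation fields before calling `accuracy`.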