FailureAtlas: Mapping the Failure Landscape of T2I Models via Active Exploration
Muxi Chen ⋅ Zhaohua Zhang ⋅ Chenchen Zhao ⋅ Mingyang Chen ⋅ Wenyu Jiang ⋅ Tianwen Jiang ⋅ Jianhuan Zhuo ⋅ Yu Tang ⋅ Qiuyong Xiao ⋅ Jihong Zhang ⋅ Qiang Xu
Abstract
Static benchmark-driven evaluation has provided a valuable foundation for analyzing Text-to-Image (T2I) models. However, the fixed and predetermined prompt sets in benchmarks inherently limit diagnostic depth, making it difficult to uncover the full landscape of models' systematic failures or to isolate their root causes. We argue for a complementary paradigm: $\textbf{active exploration}$, and introduce $\textbf{FailureAtlas}$, the first framework designed to autonomously explore and map the vast failure landscapes of T2I models at scale. Unlike benchmarks that evaluate a fixed prompt set, $\textbf{FailureAtlas}$ performs guided exploration of the input space, framing error discovery as a structured search for minimal, failure-inducing concept combinations. While this search is combinatorially explosive, we make it tractable with novel acceleration techniques. When applied to Stable Diffusion models, our method uncovers hundreds of thousands of previously unknown error slices (e.g., over 247,000 in SD1.5 alone) and provides the first large-scale evidence linking these failures to data scarcity in the training set. By providing a principled and scalable engine for deep model auditing, $\textbf{FailureAtlas}$ establishes a new, diagnostic-first methodology to guide the development of more robust generative AI.
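The core idea of searching for minimal, failure-inducing concept combinations can be sketched as a smallest-first search with superset pruning. This is an illustrative sketch only, not the authors' implementation: the `fails` oracle stands in for the actual T2I generation-and-evaluation loop, and `find_minimal_failure_slices`, `toy_oracle`, and the example concepts are hypothetical names invented here.

```python
from itertools import combinations

def find_minimal_failure_slices(concepts, fails, max_size=3):
    """Enumerate concept combinations smallest-first; record a combination
    only if it fails and no failing subset was found earlier, so every
    recorded slice is minimal. `fails` is a hypothetical oracle standing
    in for generating images from a prompt and evaluating them."""
    minimal = []
    for size in range(1, max_size + 1):
        for combo in combinations(concepts, size):
            combo = frozenset(combo)
            # Prune: any superset of a known minimal failure is redundant.
            if any(m <= combo for m in minimal):
                continue
            if fails(combo):
                minimal.append(combo)
    return minimal

# Toy oracle: pretend prompts combining "glass" and "transparent" fail,
# as does "six fingers" on its own (purely invented failure modes).
def toy_oracle(combo):
    return combo == {"six fingers"} or {"glass", "transparent"} <= combo

slices = find_minimal_failure_slices(
    ["glass", "transparent", "six fingers", "red"], toy_oracle)
```

In a real system the oracle call dominates cost, which is why the pruning step (and the paper's acceleration techniques) matter: each skipped superset saves a full generate-and-evaluate round trip.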