Hierarchical Concept Embedding and Pursuit for Interpretable Image Classification
Abstract
Interpretable-by-design models are gaining traction in computer vision because they provide faithful explanations for their predictions. In image classification, these models typically recover human-interpretable concepts from an image and use them for classification. Sparse concept recovery methods leverage the latent space of vision-language models to represent image embeddings as a sparse combination of concept embeddings. However, because such methods ignore the hierarchical structure of concepts, they can produce correct predictions with explanations that are inconsistent with the hierarchy. In this work, we propose Hierarchical Concept Embedding and Pursuit (HCEP), a framework that induces a hierarchy of concept vectors in the latent space and uses hierarchical sparse coding to recover the concepts present in an image. Given a semantic hierarchy of concepts, we construct a corresponding hierarchy of concept vectors and, assuming the correct concepts for an image form a rooted path in the hierarchy, derive conditions under which they can be identified in the embedding space. We show that hierarchical sparse coding reliably recovers hierarchical concept vectors, whereas vanilla sparse coding fails. Our experiments demonstrate that HCEP outperforms baselines on real-world datasets in concept precision and recall while maintaining competitive classification accuracy. Moreover, when the number of samples is limited, HCEP achieves superior classification accuracy and concept recovery. These results suggest that incorporating hierarchical structures into sparse coding yields more reliable and interpretable image classification models.
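To make the rooted-path assumption concrete, the following is a minimal toy sketch (not the paper's actual algorithm) of a greedy top-down hierarchical pursuit: an image embedding is simulated as the sum of concept vectors along a rooted path in a small tree, and the path is recovered by descending from the root, at each node keeping the child whose concept vector best correlates with the residual. The tree, dimensions, and random concept vectors are all hypothetical stand-ins for a real vision-language embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy hierarchy: node -> children (node 0 is the root).
children = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
n_nodes, dim = 7, 32

# One random unit concept vector per node (a stand-in for concept
# embeddings from a vision-language model's text encoder).
D = rng.standard_normal((n_nodes, dim))
D /= np.linalg.norm(D, axis=1, keepdims=True)

# Simulate an image embedding as the sum of the concept vectors along a
# rooted path, plus a small noise term.
true_path = [0, 2, 5]
x = D[true_path].sum(axis=0) + 0.05 * rng.standard_normal(dim)

def hierarchical_pursuit(x, D, children, root=0):
    """Greedy top-down pursuit: descend from the root, at each node
    keeping the child whose concept vector best correlates with the
    residual. Returns the recovered rooted path of node indices."""
    path, residual, node = [root], x - D[root], root
    while node in children:
        # Pick the child most aligned with the yet-unexplained residual.
        node = max(children[node], key=lambda c: residual @ D[c])
        residual = residual - D[node]
        path.append(node)
    return path

recovered = hierarchical_pursuit(x, D, children)
```

Because candidate concepts at each level are constrained to be children of the previously selected node, the recovered support is a rooted path by construction, which is exactly the consistency that vanilla (unstructured) sparse coding does not enforce.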