Poster
Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation
Ziheng Zhang · Jianyang Gu · Arpita Chowdhury · Zheda Mai · David Carlyn · Tanya Berger-Wolf · Yu Su · Wei-Lun Chao
Class activation map (CAM) methods have been widely studied for highlighting image regions that contribute to class predictions. Despite its simple implementation and computational efficiency, CAM often struggles to identify discriminative regions in fine-grained classification tasks. Previous efforts address this by introducing more sophisticated explanation processes on the model prediction, at the cost of extra complexity. In this paper, we propose Finer-CAM, which retains CAM's efficiency while achieving precise localization of discriminative regions. Our insight is that the deficiency of CAM is not about how it explains but what it explains. Previous methods consider, in isolation, all possible hints that explain the target class prediction, which inevitably also activates regions that are predictive of similar classes. By explicitly comparing the target class with similar classes and spotting the difference, Finer-CAM suppresses features shared with other classes and emphasizes the unique key details of the target class. The method allows an adjustable comparison strength, enabling Finer-CAM to focus coarsely on general object contours or discriminatively on fine-grained details. It is compatible with various CAM methods and can be extended to multi-modal models to accurately activate specific concepts. Quantitatively, we demonstrate that masking out the top 5% of pixels activated by Finer-CAM leads to a larger relative confidence drop than masking those selected by baselines. The source code is provided in the supplementary material.
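The comparison idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes the classic CAM setting (a linear classifier applied to globally pooled feature maps) and assumes the comparison takes the form of explaining the logit difference "target minus alpha times similar class". Neither the exact formulation nor the names used here (cam, difference_cam, top_fraction_mask, alpha) come from the paper's released code; they are hypothetical and serve only to show how shared features can cancel while class-specific ones remain.

```python
import numpy as np


def cam(features, weights):
    """Classic CAM: weight each feature map (K, H, W) by a class weight (K,),
    sum over channels, and keep only positive evidence."""
    return np.maximum(np.tensordot(weights, features, axes=1), 0.0)


def difference_cam(features, w_target, w_similar, alpha=1.0):
    """Assumed comparison form: explain logit_target - alpha * logit_similar.
    alpha = 0 recovers the plain CAM; larger alpha suppresses shared features more."""
    return cam(features, w_target - alpha * w_similar)


def top_fraction_mask(saliency, fraction=0.05):
    """Boolean mask of the most activated pixels (about `fraction` of the map).
    Ties at the threshold are all included, so the mask can be slightly larger."""
    k = max(1, int(round(fraction * saliency.size)))
    threshold = np.partition(saliency.ravel(), -k)[-k]
    return saliency >= threshold


# Toy example: channel 0 fires on a region shared by both classes,
# channel 1 fires on a small detail that only the target class uses.
features = np.zeros((2, 4, 4))
features[0, :2, :] = 1.0   # shared body region (top two rows)
features[1, 3, 3] = 2.0    # fine-grained detail unique to the target class
w_target = np.array([1.0, 1.0])
w_similar = np.array([1.0, 0.0])

coarse = cam(features, w_target)                              # highlights both regions
fine = difference_cam(features, w_target, w_similar, alpha=1.0)  # keeps only the detail
mask = top_fraction_mask(fine, fraction=0.05)                 # pixels to mask for evaluation
```

In this toy setup the shared channel cancels exactly when alpha is 1, so only the unique detail stays active; intermediate values of alpha trade off between the coarse map and the fine one. The mask from top_fraction_mask corresponds to the kind of pixel selection used in the masking evaluation described above, though the confidence-drop measurement itself needs a trained model and is not shown.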