MedLIME: A Distribution-Aligned and Evidence-Supported Framework for Medical Saliency Explanations
Abstract
Saliency-based explainability methods are widely used to interpret deep learning models in medical imaging, yet many existing approaches require white-box access to models, which is not always available due to privacy concerns. In this work, we introduce MedLIME, a novel, model-agnostic explanation framework designed to enhance the robustness and fidelity of saliency maps for medical imaging abnormality localization. Building upon the Local Interpretable Model-agnostic Explanations (LIME) paradigm, MedLIME integrates three key components: (1) Generative Masking (GM), (2) Supervised Test-Time Adaptation (STT), and (3) Evidence-Based Regularization (EBR), which together improve the accuracy of LIME's saliency maps. Extensive experiments on multiple medical imaging datasets across three model architectures demonstrate that MedLIME consistently outperforms gradient-based and perturbation-based baselines in abnormality localization, as measured by AUPRC. Our results highlight that incorporating generative reconstruction, adaptive perturbation, and data-driven regularization improves the reliability and interpretability of medical imaging models.