Poster
A Unified, Resilient, and Explainable Adversarial Patch Detector
Vishesh Kumar · Akshay Agarwal
ExHall D Poster #406
Deep Neural Networks (DNNs), the backbone of almost every computer vision task, are vulnerable to adversarial attacks, particularly physical out-of-distribution (OOD) adversarial patches. Existing models often struggle to interpret these attacks in ways that align with human visual perception. We propose AdvPatchXAI, a generalized, robust, and explainable defense algorithm designed to protect DNNs against physical adversarial threats. AdvPatchXAI employs a novel patch decorrelation loss that reduces feature redundancy and enhances the distinctiveness of patch representations, enabling better generalization to unseen adversarial scenarios. It learns prototypical parts in a self-supervised fashion, improving interpretability and correlation with human vision. The model uses a sparse linear layer for classification, making the decision-making process globally interpretable through a set of learned prototypes and locally explainable by pinpointing the relevant prototypes within an image. Our comprehensive evaluation shows that AdvPatchXAI not only closes the "semantic" gap between latent space and pixel space but also effectively handles unseen adversarial patches, even when they are perturbed with unseen corruptions, significantly advancing DNN robustness in practical settings.
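The abstract does not give the exact formulation of the decorrelation loss or the sparse head, so the following is a minimal sketch of how such components are commonly built, assuming a Barlow Twins-style off-diagonal correlation penalty and an L1-regularized linear layer over prototype activations. All names (`patch_decorrelation_loss`, `SparsePrototypeClassifier`) are illustrative, not the authors' API.

```python
# Hedged sketch: one plausible reading of a "patch decorrelation loss"
# and a sparse, prototype-based linear classifier. Not the authors' code.
import torch
import torch.nn as nn


def patch_decorrelation_loss(features: torch.Tensor) -> torch.Tensor:
    """Penalize redundancy between feature dimensions of patch embeddings.

    features: (batch, dim) patch-level embeddings.
    Returns the mean squared off-diagonal entry of the empirical
    correlation matrix; it is zero when dimensions are fully decorrelated.
    """
    # Standardize each feature dimension across the batch.
    z = (features - features.mean(dim=0)) / (features.std(dim=0) + 1e-8)
    n = z.shape[0]
    corr = (z.T @ z) / (n - 1)                  # (dim, dim) correlation matrix
    off_diag = corr - torch.diag(torch.diagonal(corr))
    return (off_diag ** 2).mean()


class SparsePrototypeClassifier(nn.Module):
    """Illustrative sparse linear head over prototype similarities.

    Each class score is a sparse weighted sum of prototype activations,
    so the few nonzero weights identify which learned prototypes support
    each class (global interpretability); the activations themselves
    localize the evidence in a given image (local explainability).
    """

    def __init__(self, num_prototypes: int, num_classes: int):
        super().__init__()
        self.linear = nn.Linear(num_prototypes, num_classes, bias=False)

    def forward(self, prototype_activations: torch.Tensor) -> torch.Tensor:
        return self.linear(prototype_activations)

    def l1_penalty(self) -> torch.Tensor:
        # L1 regularization drives most weights to zero, keeping the
        # prototype-to-class mapping sparse and human-readable.
        return self.linear.weight.abs().sum()
```

In training, a sketch like this would add `patch_decorrelation_loss` and the classifier's `l1_penalty()` (each scaled by a tunable coefficient) to the task loss, trading off accuracy against feature distinctiveness and decision sparsity.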