

Poster

A Unified, Resilient, and Explainable Adversarial Patch Detector

Vishesh Kumar · Akshay Agarwal

ExHall D Poster #406
[ Project Page ] [ Paper PDF ]
Sun 15 Jun 2 p.m. PDT — 4 p.m. PDT

Abstract:

Deep Neural Networks (DNNs), the backbone architecture of almost every computer vision task, are vulnerable to adversarial attacks, particularly physical out-of-distribution (OOD) adversarial patches. Existing defenses often struggle to interpret these attacks in ways that align with human visual perception. Our proposed AdvPatchXAI is a generalized, robust, and explainable defense algorithm designed to protect DNNs against physical adversarial threats. AdvPatchXAI employs a novel patch decorrelation loss that reduces feature redundancy and enhances the distinctiveness of patch representations, enabling better generalization across unseen adversarial scenarios. It learns prototypical parts in a self-supervised fashion, improving interpretability and alignment with human vision. The model uses a sparse linear layer for classification, making the decision-making process globally interpretable through a set of learned prototypes and locally explainable by pinpointing the relevant prototypes within an image. Our comprehensive evaluation shows that AdvPatchXAI not only closes the "semantic" gap between latent space and pixel space but also effectively handles unseen adversarial patches, even when they are perturbed with unseen corruptions, thereby significantly advancing DNN robustness in practical settings.
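The abstract does not give the exact form of the patch decorrelation loss, but a minimal sketch of the general idea is shown below: standardize the patch embeddings per dimension, compute their cross-correlation matrix, and penalize its off-diagonal entries so that feature dimensions become less redundant (a Barlow Twins-style decorrelation objective). The function name and formulation here are illustrative assumptions, not the paper's implementation.

```python
import torch

def patch_decorrelation_loss(features: torch.Tensor) -> torch.Tensor:
    """Illustrative decorrelation loss (assumed form, not AdvPatchXAI's exact loss).

    features: (N, D) batch of patch embeddings.
    Drives the D x D correlation matrix of the embeddings toward the
    identity by penalizing its off-diagonal entries, reducing feature
    redundancy across dimensions.
    """
    # Standardize each feature dimension (zero mean, unit variance).
    z = features - features.mean(dim=0, keepdim=True)
    z = z / (z.std(dim=0, keepdim=True) + 1e-6)

    n = z.shape[0]
    corr = (z.T @ z) / n                                # (D, D) correlation matrix
    off_diag = corr - torch.diag(torch.diagonal(corr))  # zero out the diagonal
    return (off_diag ** 2).sum()

# Usage sketch: 32 hypothetical patch embeddings of dimension 128.
feats = torch.randn(32, 128)
loss = patch_decorrelation_loss(feats)
```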
