Paper in Workshop: 7th Safe Artificial Intelligence for All Domains (SAIAD)
Robustness Evaluation for Video Models with Reinforcement Learning
Ashwin Babu · Seyed Sajad Mousavi · Vineet Gundecha · Sahand Ghorbanpour · Avisek Naug · Ricardo Luna Gutierrez · Antonio Guillen-Perez · Soumyendu Sarkar
Evaluating the robustness of video classification models is considerably more challenging than evaluating image-based models. The added temporal dimension significantly increases complexity and computational cost. A key challenge is keeping the perturbations needed to induce misclassification to a minimum. In this work, we propose a multi-agent reinforcement learning approach with spatial and temporal agents that cooperatively learn to identify the sensitive spatial and temporal regions of a given video. The agents take temporal coherence into account when generating fine perturbations, leading to a more effective and visually imperceptible attack. Our method outperforms state-of-the-art solutions on the Lp metric and on the average number of queries. It also supports custom distortion types, making the robustness evaluation more relevant to the use case. We extensively evaluate four popular models for video action recognition on two popular datasets, HMDB-51 and UCF-101.
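To make the Lp metric and the idea of a localized spatio-temporal perturbation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a video stored as a NumPy array of shape (frames, height, width, channels) with values in [0, 1]. The function names `perturb_region` and `lp_distance` are hypothetical, and the region is picked by hand here, whereas in the paper it is selected by the learned agents.

```python
import numpy as np

def perturb_region(video, frames, box, delta):
    """Add a constant offset `delta` to one patch across a range of frames.

    video  : float array of shape (T, H, W, C), values in [0, 1] (assumed layout)
    frames : (t0, t1) half-open frame range
    box    : (y0, y1, x0, x1) half-open pixel box
    """
    t0, t1 = frames
    y0, y1, x0, x1 = box
    adv = video.copy()
    # Only the chosen frames and pixels are touched; the rest stays identical.
    adv[t0:t1, y0:y1, x0:x1, :] += delta
    # Keep the result a valid video.
    return np.clip(adv, 0.0, 1.0)

def lp_distance(video, adv, p=2):
    """Lp norm of the perturbation, flattened over all frames and pixels."""
    diff = (adv - video).ravel()
    return float(np.linalg.norm(diff, ord=p))

# Toy example: 8 gray frames of 16x16 RGB, perturb a 4x4 patch in 2 frames.
video = np.full((8, 16, 16, 3), 0.5)
adv = perturb_region(video, frames=(2, 4), box=(0, 4, 0, 4), delta=0.01)

print("L2   :", lp_distance(video, adv, p=2))
print("Linf :", lp_distance(video, adv, p=np.inf))
print("changed values:", int(np.count_nonzero(adv - video)))
```

Restricting the perturbation to a few frames and a small patch keeps the L2 norm and the count of changed values small while the per-value magnitude stays the same, which is why identifying the sensitive region matters for a minimal attack.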