

Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection

Junxi Chen · Liang Li · Li Su · Zheng-Jun Zha · Qingming Huang

Arch 4A-E Poster #359
Thu 20 Jun 5 p.m. PDT — 6:30 p.m. PDT


Weakly supervised Video Anomaly Detection (wVAD) aims to detect frame-level anomalies using only video-level labels during training. Because only weak labels are available, Multi-Instance Learning (MIL), which relies on binary video-level labels, is the prevailing approach in wVAD. However, binary supervision is insufficient for MIL to model diverse abnormal patterns. In addition, the coupling between an abnormality and its context hinders learning a clear abnormal event boundary. In this paper, we propose prompt-enhanced multi-instance learning, which is devised to detect various abnormal events while ensuring clear event boundaries. Concretely, we first design abnormal-aware prompts by combining abnormal class annotations with trainable prompts, which dynamically incorporate semantic priors into video features. The detector can then use these semantically rich features to capture diverse abnormal patterns. In addition, a normal context prompt is introduced to amplify the distinction between an abnormality and its context, facilitating the generation of clear boundaries. Through the mutual enhancement of the abnormal-aware and normal context prompts, the model constructs a discriminative representation that detects diverse anomalies without ambiguous event boundaries. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) performance on three public benchmarks.
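
For readers unfamiliar with the MIL baseline the abstract refers to, the following is a minimal sketch of how a binary video-level label can supervise per-snippet anomaly scores. It is not the authors' method: top-k mean pooling and a binary cross-entropy loss are one common formulation chosen here for illustration, and the names `video_score`, `mil_bce_loss`, and the parameter `k` are hypothetical.

```python
import math


def video_score(snippet_scores, k=3):
    """Pool per-snippet anomaly scores (each in [0, 1]) into one
    video-level score by averaging the k highest scores.
    Top-k pooling is an assumption of this sketch."""
    k = max(1, min(k, len(snippet_scores)))
    top = sorted(snippet_scores, reverse=True)[:k]
    return sum(top) / k


def mil_bce_loss(snippet_scores, video_label, k=3, eps=1e-7):
    """Binary cross-entropy between the pooled video-level score and
    the binary video-level label (1 = contains an anomaly, 0 = normal).
    No frame-level labels are used."""
    p = video_score(snippet_scores, k)
    p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
    return -(video_label * math.log(p) + (1 - video_label) * math.log(1.0 - p))


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.8, 0.2, 0.7]  # made-up snippet scores
    print(video_score(scores, k=3))
    print(mil_bce_loss(scores, video_label=1, k=3))
```

Because the only training signal is a single bit per video, nothing in this loss distinguishes one anomaly type from another or marks where an event starts and ends, which is the limitation the abstract describes.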
