GPFlow: Gaussian Prototype Probability Flow for Unsupervised Multi-Modal Anomaly Detection
Abstract
We address unsupervised multi-modal anomaly detection (MAD) in few-shot regimes, where only a handful of normal exemplars are available per class. Existing approaches struggle with such data scarcity because they fail to capture distribution-level information about normal appearance and geometry. To capture diverse and continuous variations in normality, we propose GPFlow, a probability-flow-inspired framework that embeds diverse normal patterns into a latent space of learnable Gaussian prototypes. At its core, GPFlow uses an analytical Posterior-Mean Path (PMP) router that iteratively moves features toward prototype-centered high-probability neighborhoods, acting as an explicit information bottleneck that prevents trivial reconstruction of anomalies. To exploit multi-modal cues, GPFlow employs a coupled reconstruction architecture that enforces both intra- and cross-modal consistency at the prototype level. Finally, to handle the distribution shift between sparse training samples and unseen test samples, GPFlow incorporates inference-aware prototype refinement, which dynamically expands the prototypes' coverage to new normal variations at test time. Extensive experiments on MVTec-3D-AD and Eyecandies show that GPFlow achieves state-of-the-art performance with only a few normal training samples while remaining computationally efficient.
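To make the routing idea concrete, the following is a minimal sketch of one way a posterior-mean update toward Gaussian prototypes could look. It is not the paper's PMP router, whose exact form the abstract does not give. The sketch assumes isotropic prototypes with a shared standard deviation `sigma`, uniform mixture weights, and a fixed `step_size`; the function names `posterior_mean` and `route` are hypothetical.

```python
import numpy as np


def posterior_mean(x, means, sigma=1.0):
    """Responsibility-weighted average of prototype means for each feature.

    x:     (N, D) array of features
    means: (K, D) array of prototype centers
    Assumes isotropic Gaussians with shared std `sigma` and uniform weights.
    """
    # Squared Euclidean distance from every feature to every prototype: (N, K)
    d2 = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    # Softmax over prototypes, computed stably by subtracting the row maximum
    logits = -d2 / (2.0 * sigma ** 2)
    logits = logits - logits.max(axis=1, keepdims=True)
    resp = np.exp(logits)
    resp = resp / resp.sum(axis=1, keepdims=True)
    # Posterior mean under the assumed mixture: (N, D)
    return resp @ means


def route(x, means, steps=5, step_size=0.5, sigma=1.0):
    """Iteratively move features part of the way toward their posterior mean."""
    x = np.asarray(x, dtype=float)
    for _ in range(steps):
        x = x + step_size * (posterior_mean(x, means, sigma) - x)
    return x


prototypes = np.array([[0.0, 0.0], [10.0, 10.0]])
features = np.array([[1.0, 1.0], [9.0, 8.0]])
routed = route(features, prototypes)
```

In this sketch each feature is pulled toward the prototype that best explains it, so the routed output stays near the prototypes regardless of the input. That property is what lets such a step act as a bottleneck: an input far from every prototype cannot be reproduced unchanged.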