Universal-to-Specific: Dynamic Knowledge-Guided Multiple Instance Learning for Few-Shot Whole Slide Image Classification
Abstract
Multiple Instance Learning (MIL) has emerged as the dominant paradigm for analyzing gigapixel-scale Whole Slide Images (WSIs). However, recent methods that leverage guidance from Vision-Language Models often rely on static, universal pathological descriptions. This one-size-fits-all strategy fails to account for the vast morphological heterogeneity within individual WSIs, because its uniform guidance is not tailored to slide-specific visual evidence. To address this, we propose DyKo, a \textbf{Dy}namic \textbf{K}n\textbf{o}wledge-guided MIL framework that adapts universal knowledge to slide-specific evidence for few-shot WSI classification. The core of DyKo is the WSI-Adaptive Knowledge Instantiation (WAKI) module. WAKI first identifies key visual prototypes within a specific WSI's histology. These slide-specific prototypes then serve as queries to retrieve relevant concepts from a pathology knowledge base, and the retrieved knowledge is used to synthesize a unique, knowledge-instantiated feature for each instance, yielding guidance tailored at the patch level. To ensure fidelity and prevent semantic drift, we further introduce a Structural Consistency loss that enforces alignment between knowledge-instantiated and visual features. Comprehensive experiments on four public real-world cancer datasets demonstrate that DyKo outperforms state-of-the-art methods in few-shot pathology diagnosis. Code will be made publicly available upon paper acceptance.
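The abstract leaves the internals of WAKI unspecified, so the following is only a minimal sketch of one plausible reading of the described pipeline: attention-pooled slide-specific prototypes, soft retrieval over a frozen bank of pathology-concept embeddings, per-patch fusion into knowledge-instantiated features, and a consistency loss that matches pairwise-similarity structure. Every name here (\texttt{WAKISketch}, \texttt{proto\_queries}, \texttt{knowledge\_bank}, \texttt{structural\_consistency\_loss}) and every design choice (the specific pooling, retrieval, fusion, and loss forms) is an illustrative assumption, not the authors' implementation.

\begin{verbatim}
import torch
import torch.nn as nn
import torch.nn.functional as F

class WAKISketch(nn.Module):
    """Hypothetical sketch of WSI-Adaptive Knowledge Instantiation."""

    def __init__(self, dim, n_prototypes, knowledge_bank):
        super().__init__()
        # Learnable queries that summarize a slide into a few visual prototypes.
        self.proto_queries = nn.Parameter(torch.randn(n_prototypes, dim) * 0.02)
        # Frozen pathology-concept embeddings, e.g. from a text encoder: (K, dim).
        self.register_buffer("knowledge_bank", knowledge_bank)
        # Fuses each patch feature with its retrieved knowledge.
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, patches):  # patches: (N, dim) instance features of one WSI
        # 1) Slide-specific prototypes via attention pooling over patches.
        attn = torch.softmax(self.proto_queries @ patches.T, dim=-1)      # (P, N)
        prototypes = attn @ patches                                       # (P, dim)

        # 2) Prototypes act as queries to softly retrieve relevant concepts.
        sim = torch.softmax(prototypes @ self.knowledge_bank.T, dim=-1)   # (P, K)
        retrieved = sim @ self.knowledge_bank                             # (P, dim)

        # 3) Instantiate per-patch knowledge: route each patch to the
        #    knowledge retrieved for its nearest prototypes, then fuse.
        route = torch.softmax(patches @ prototypes.T, dim=-1)             # (N, P)
        patch_knowledge = route @ retrieved                               # (N, dim)
        instantiated = self.fuse(torch.cat([patches, patch_knowledge], dim=-1))
        return instantiated, patch_knowledge

def structural_consistency_loss(knowledge_feats, visual_feats):
    """One plausible reading of the Structural Consistency loss: force the
    pairwise-similarity structure of knowledge-instantiated features to
    match that of the visual features, discouraging semantic drift."""
    k = F.normalize(knowledge_feats, dim=-1)
    v = F.normalize(visual_feats, dim=-1)
    return F.mse_loss(k @ k.T, v @ v.T)
\end{verbatim}

Under this reading, the instantiated features would replace (or augment) the raw patch features fed to a standard MIL aggregator, with the consistency loss added to the classification objective; the actual architecture may differ.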