Poster
M3amba: Memory Mamba is All You Need for Whole Slide Image Classification
Tingting Zheng · Kui Jiang · Yi Xiao · Sicheng Zhao · Hongxun Yao
Multi-instance learning (MIL) has demonstrated impressive performance in whole slide image (WSI) analysis. However, existing approaches struggle with undesirable results and unbearable computational overhead due to the quadratic complexity of Transformers. Recently, Mamba has offered a feasible solution for modeling long-range dependencies with linear complexity. However, vanilla Mamba inherently suffers from contextual forgetting issues, making it ill-suited for capturing global dependencies across instances in large-scale WSIs. To address this, we propose a memory-driven Mamba network, dubbed M3amba, to fully explore the global latent relations among instances. Specifically, M3amba retains and iteratively updates historical information with a dynamic memory bank (DMB), thus overcoming the catastrophic forgetting defects of Mamba for long-term context representation. For better feature representation, M3amba involves an intra-group bidirectional Mamba (BiMamba) block to refine local interactions within groups. Meanwhile, we additionally perform cross-attention fusion to incorporate relevant historical information across groups, facilitating richer inter-group connections. The joint learning of inter- and intra-group representations with memory merits enables M3amba with a more powerful capability for achieving accurate and comprehensive WSI representation. Extensive experiments on four datasets demonstrate that M3amba outperforms the state-of-the-art by 6.2\% and 7.0\% in accuracy on the TCGA BRAC and TCGA Lung datasets while maintaining low computational costs.
Live content is unavailable. Log in and register to view live content