MDCS-MoAME: Multi-directional Composite Scanning with Mixture of Attention and Mamba Experts for Cancer Survival Prediction
Abstract
Multi-modal learning approaches that integrate pathological images with genomic profiles have significantly enhanced the accuracy of survival prediction. However, previous methods often struggle to process gigapixel whole slide images (WSIs), which require long-range modeling, and sparse genomic profiles, owing to the limitations of conventional scanning strategies for serializing data and the complex, heterogeneous nature of the modalities. Inspired by recent advances in Mamba and mixture of experts (MoE), we propose a novel multi-directional composite scanning strategy with a mixture of attention and Mamba experts (MDCS-MoAME) for cancer survival prediction. Specifically, we apply a multi-directional composite scanning (MDCS) strategy to both WSIs and genomic profiles, and use a Mamba encoder to process intra-modal representations at the region, patch, and gene levels, ensuring sufficient utilization of the intrinsic information within each modality. To further capture heterogeneous inter-modal representations, we introduce a mixture of attention and Mamba experts (MoAME), which dynamically selects tailored experts to model complex inter-modal correlations and flexibly focuses on the interactions between modalities. Finally, we introduce alignment constraints that recalibrate inter-modal interactions and reduce intra- and inter-modal representation redundancy, enhancing the discriminative power of the learned representations for comprehensive survival analysis. Experimental results on five publicly available datasets demonstrate that our method outperforms existing approaches, achieving state-of-the-art performance. Our code is included in the supplementary material.
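As a rough illustration of what scanning a patch grid in several directions can look like, the following minimal Python sketch serializes a 2D grid of patch tokens into four 1D sequences. It is not the MDCS strategy itself, whose details are not given in the abstract; the function name `multi_directional_scan`, the dictionary keys, and the choice of four directions (row-major and column-major, each forward and reversed) are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def multi_directional_scan(grid: Sequence[Sequence[T]]) -> Dict[str, List[T]]:
    """Serialize a rectangular 2D grid of tokens into four 1D sequences.

    Hypothetical helper: the four directions are an illustrative assumption,
    not the scanning scheme defined by the paper.
    """
    if not grid or any(len(row) != len(grid[0]) for row in grid):
        raise ValueError("grid must be non-empty and rectangular")

    n_rows, n_cols = len(grid), len(grid[0])

    # Row-major order: left to right within a row, rows from top to bottom.
    row_major = [grid[r][c] for r in range(n_rows) for c in range(n_cols)]
    # Column-major order: top to bottom within a column, columns left to right.
    col_major = [grid[r][c] for c in range(n_cols) for r in range(n_rows)]

    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


if __name__ == "__main__":
    # Integers stand in for patch feature vectors on a 2 x 3 grid.
    patches = [[0, 1, 2], [3, 4, 5]]
    for name, seq in multi_directional_scan(patches).items():
        print(name, seq)
```

In a sequence model such as Mamba, each of these orderings would typically be encoded separately and the outputs combined, so that every token can receive context from neighbors in more than one spatial direction; how the paper composes its scans is left to the method section.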