

Poster

Multi-modal Medical Diagnosis via Large-small Model Collaboration

Wanyi Chen · Zihua Zhao · Jiangchao Yao · Ya Zhang · Jiajun Bu · Haishuai Wang


Abstract:

Recent advances in medical AI show a clear trend toward large models in healthcare. However, developing large models for multi-modal medical diagnosis remains challenging due to the lack of sufficient modality-complete medical data, while most existing multi-modal diagnostic models are relatively small and limited in feature extraction capability. To bridge this gap, we propose AdaCoMed, an adaptive collaborative-learning framework that synergistically integrates off-the-shelf single-modal medical large models with multi-modal small models. Our framework first employs a mixture-of-modality-experts (MoME) architecture to combine features extracted by multiple single-modal medical large models, and then introduces a novel adaptive co-learning mechanism to collaborate with a multi-modal small model. This co-learning mechanism, guided by an adaptive weighting strategy, dynamically balances the complementary strengths of the MoME-fused large-model features and the cross-modal reasoning capabilities of the small model. Extensive experiments on two representative multi-modal medical datasets (MIMIC-IV-MM and MMIST ccRCC), spanning six modalities and four diagnostic tasks, demonstrate consistent improvements over state-of-the-art baselines, establishing AdaCoMed as a promising solution for real-world medical diagnosis applications.
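The abstract does not specify implementation details, but the described design (MoME fusion over per-modality features from frozen large models, plus an adaptively weighted combination with a small multi-modal model) can be illustrated with a minimal sketch. The sketch below assumes PyTorch; the module names, dimensions, the linear gating network, and the learnable scalar weight are hypothetical stand-ins, not the authors' actual architecture.

```python
# Illustrative sketch only: gating, projection sizes, and the scalar
# co-learning weight are assumptions, not the published AdaCoMed design.
import torch
import torch.nn as nn


class MoMEFusion(nn.Module):
    """Mixture-of-modality-experts: gate over per-modality features
    extracted by (frozen) single-modal large models."""

    def __init__(self, feat_dims, hidden_dim):
        super().__init__()
        # One projection ("expert adapter") per modality feature stream.
        self.experts = nn.ModuleList(nn.Linear(d, hidden_dim) for d in feat_dims)
        # Gating network scores each modality expert from the concatenated features.
        self.gate = nn.Linear(sum(feat_dims), len(feat_dims))

    def forward(self, feats):  # feats: list of [B, d_m] tensors
        weights = torch.softmax(self.gate(torch.cat(feats, dim=-1)), dim=-1)
        projected = torch.stack([e(f) for e, f in zip(self.experts, feats)], dim=1)
        return (weights.unsqueeze(-1) * projected).sum(dim=1)  # [B, hidden_dim]


class AdaptiveCoLearner(nn.Module):
    """Adaptively weight the MoME-fused large-model branch against a
    multi-modal small model's prediction (hypothetical weighting scheme)."""

    def __init__(self, feat_dims, hidden_dim, num_classes, small_model):
        super().__init__()
        self.mome = MoMEFusion(feat_dims, hidden_dim)
        self.large_head = nn.Linear(hidden_dim, num_classes)
        self.small_model = small_model
        # Learnable scalar controlling the balance between the two branches.
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, feats, raw_inputs):
        logits_large = self.large_head(self.mome(feats))
        logits_small = self.small_model(raw_inputs)
        a = torch.sigmoid(self.alpha)  # adaptive weight in (0, 1)
        return a * logits_large + (1 - a) * logits_small


if __name__ == "__main__":
    B, num_classes = 4, 3
    feat_dims = [768, 1024]  # e.g. text and image features from large models
    small = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, num_classes))
    model = AdaptiveCoLearner(feat_dims, 256, num_classes, small)
    feats = [torch.randn(B, d) for d in feat_dims]
    print(model(feats, torch.randn(B, 32)).shape)  # torch.Size([4, 3])
```

In this sketch the balance is a single learnable scalar; the paper's adaptive weighting strategy is described as dynamic, so a per-sample or per-task weighting network would be a closer, though still speculative, reading.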
