Paper in Workshop: 8th Multimodal Learning and Applications Workshop
Transformer-Based Lung Infection Severity Prediction with Cross Attention and Conditional TransMix Augmentation
Bouthaina Slika · Fadi Dornaika · Fares Bougourzi · Karim Hammoudi
Lung infections, particularly pneumonia, pose significant health risks and can rapidly worsen, especially during pandemics. Developing advanced AI-driven tools for severity prediction based on medical imaging is essential for timely decision-making and treatment, ultimately saving lives. In this study, we introduce a novel approach applicable to multiple medical imaging modalities, including CT scans and chest X-rays, for predicting lung infection severity. Our method consists of two key components: a Transformer-based severity prediction model and an augmentation strategy called Conditional Online TransMix, designed to address data imbalance. The proposed model employs parallel encoders, integrating Pyramid Vision Transformers (PVTs) with a cross-gated attention mechanism and a feature aggregation module to generate a scalar severity score. To enhance model generalization across datasets, we introduce a tailored augmentation technique that synthesizes new mixed severity scores linked to image patches. We validate our approach on the RALO CXR and Per-COVID-19 CT datasets, where it outperforms several state-of-the-art deep learning models on both imaging modalities. By incorporating a customized weighted loss function, our method enhances the precision of automated lung disease severity assessment, providing a reliable and adaptable AI tool for clinical diagnosis and treatment planning.
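The idea of synthesizing mixed severity scores linked to image patches can be illustrated with a minimal sketch. It is not the authors' Conditional Online TransMix: the abstract does not specify the mixing rule or the condition under which mixing is applied, so the example below assumes a simpler area-weighted rule (in the style of CutMix), whereas TransMix-style methods typically weight labels by attention. The function name `mix_patches` and the parameters `patch` and `ratio` are hypothetical.

```python
import numpy as np

def mix_patches(img_a, img_b, score_a, score_b, patch=16, ratio=0.5, rng=None):
    """Replace a random subset of patches in img_a with patches from img_b.

    Assumption: the mixed severity score is the area-weighted average of the
    two source scores. This is an illustrative rule, not the paper's.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[:2]
    assert img_a.shape == img_b.shape, "images must have the same shape"
    assert h % patch == 0 and w % patch == 0, "image size must be a multiple of patch"

    grid_h, grid_w = h // patch, w // patch
    n_patches = grid_h * grid_w
    n_swap = int(round(ratio * n_patches))

    # Pick which patches come from img_b.
    patch_mask = np.zeros(n_patches, dtype=bool)
    patch_mask[rng.choice(n_patches, size=n_swap, replace=False)] = True
    patch_mask = patch_mask.reshape(grid_h, grid_w)

    # Expand the patch-level mask to pixel resolution.
    pixel_mask = patch_mask.repeat(patch, axis=0).repeat(patch, axis=1)
    if img_a.ndim == 3:
        pixel_mask = pixel_mask[..., None]  # broadcast over channels

    mixed_img = np.where(pixel_mask, img_b, img_a)

    # Fraction of the mixed image that comes from img_b.
    lam = n_swap / n_patches
    mixed_score = (1.0 - lam) * score_a + lam * score_b
    return mixed_img, mixed_score

# Usage: mix a low-severity and a high-severity image (synthetic placeholders).
low = np.zeros((64, 64), dtype=np.float32)
high = np.ones((64, 64), dtype=np.float32)
mixed, score = mix_patches(low, high, score_a=2.0, score_b=6.0, ratio=0.25,
                           rng=np.random.default_rng(0))
```

In an online setting, a function like this would be called per batch during training, and a conditional variant might apply it only to samples from under-represented severity ranges. That condition is an assumption here, since the abstract states only that the augmentation addresses data imbalance.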