Paper in Workshop: Domain Generalization: Evolution, Breakthroughs, and Future Horizons
ARDGen: Augmentation Regularization for Domain-Generalized Medical Report Generation
Syed Bilal Ahsan · Muhammad Ikhalas · Muhammad Muzamil Khan · Sana Ullah
Automated medical report generation from chest radiographs is pivotal for clinical decision support, yet existing systems degrade under domain shift across diverse imaging sources. In this work, we propose a multi-modal framework that robustly generates clinically relevant diagnostic reports by integrating visual and textual modalities. Our model comprises two branches: an image classification branch that uses a pre-trained ResNet-based encoder with image augmentation and consistency regularization, and a report generation branch with dual BERT-based decoders. The primary text decoder produces the diagnostic narrative, while an Augmentation Regularization Decoder (ARD), used exclusively during training, serves as a regularizer to improve the model's adaptability. We further enforce text-level consistency through augmentation-driven losses. Extensive experiments on the MIMIC-CXR and IU-Xray datasets demonstrate that our approach significantly outperforms existing methods, achieving better generalization and improved report quality on unseen data. The framework offers a scalable and robust solution for reliable automated diagnosis, bridging the gap between visual evidence and accurate clinical narratives.
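The abstract does not give the form of the consistency losses or how the training terms are combined, so the following is only a minimal sketch under stated assumptions. It assumes a symmetric KL divergence between the predictions for two augmented views of the same input, and a weighted sum of a primary decoder loss, an ARD loss, and image-level and text-level consistency terms. All function names and weights (`consistency_loss`, `training_objective`, `lambda_ard`, `lambda_img`, `lambda_txt`) are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_a, logits_b):
    """Symmetric KL between predictions for two augmented views.

    Assumption: the paper's consistency regularizer may use a different
    divergence; symmetric KL is a common choice used here for illustration.
    """
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def training_objective(main_loss, ard_loss, image_consistency, text_consistency,
                       lambda_ard=0.5, lambda_img=1.0, lambda_txt=1.0):
    """Weighted sum of the loss terms (weights are illustrative, not reported).

    The ARD term appears only here, in the training objective; at inference
    time only the primary decoder would be used.
    """
    return (main_loss
            + lambda_ard * ard_loss
            + lambda_img * image_consistency
            + lambda_txt * text_consistency)


# Two hypothetical sets of class logits for differently augmented views.
view_a = [2.0, 0.5, -1.0]
view_b = [1.5, 0.7, -0.8]
img_term = consistency_loss(view_a, view_b)
total = training_objective(main_loss=1.0, ard_loss=2.0,
                           image_consistency=img_term, text_consistency=0.25)
```

The consistency term is zero when both views yield identical predictions and grows as they diverge, which is the property a regularizer of this kind relies on to discourage sensitivity to augmentation.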