BiomedCCPL: Causal Conditional Prompt Learning for Biomedical Vision-Language Models
Xueliang Cui ⋅ Juncai Zhang ⋅ Jiacheng Hou ⋅ Dan Lu ⋅ Hao Zhang ⋅ Ruxin Wang
Abstract
Vision-language models (VLMs) have demonstrated strong potential for adapting to downstream biomedical tasks with limited training samples. However, their generalization to unseen classes within the same dataset remains limited, as the image–text alignment semantics often rely on spurious cues that are present in seen classes but do not transfer. To tackle this, we propose $\textbf{BiomedCCPL (Causal Conditional Prompt Learning)}$, a framework that uses a Visual Grounder with Adaptive Prototype (VGAP) to generate image-conditional prompts from multi-scale adaptive prototypes, and a Synergistic Causal Disentanglement (SCD) module to regularize their generation. Guided by a causal analysis of generalization to unseen classes, SCD combines multiple synergistic learning objectives to perform front-door adjustment, ensuring that the dynamically generated image-conditional prompts focus on underlying diagnostic image features shared across seen and unseen classes. Experiments on 11 datasets spanning 9 modalities demonstrate that BiomedCCPL effectively enhances the model's data efficiency and generalization ability. In particular, on the base-to-novel task, BiomedCCPL achieves an average harmonic mean (HM) of 79.98\%, surpassing the previous state-of-the-art by 6.45\%.
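For reference, the front-door adjustment invoked above can be stated in its standard form from causal inference (the symbols here are generic and not necessarily the paper's notation):

$$
P(Y \mid do(X)) \;=\; \sum_{m} P(m \mid X) \sum_{x'} P(Y \mid m, x')\, P(x'),
$$

where $X$ would correspond to the input image, $M$ to a mediator (here, plausibly the image-conditional prompt), and $Y$ to the prediction; summing over the mediator and re-marginalizing over $X$ blocks backdoor paths through an unobserved confounder, such as spurious cues shared by seen classes.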