Towards Dynamic Modality Alignment in Multimodal Continual Learning
Abstract
Multimodal Continual Learning (MMCL) aims to enable models to continuously accumulate knowledge across multiple tasks and modalities without forgetting prior information. MMCL is more challenging than unimodal continual learning because it requires effective cooperation and complementarity between modalities. Existing methods often treat modality alignment as a static process, assuming that once alignment is established, it remains fixed. We argue, however, that modality alignment is inherently dynamic, evolving as tasks are learned and as features propagate across layers. To address this, we introduce Dynamic Alignment Graph Regularization (DAGR), a novel approach that explicitly models this evolving alignment across layers. By incorporating multi-level graph regularization, DAGR stabilizes the alignment process and mitigates catastrophic forgetting. Extensive experiments on benchmarks such as MTIL show that DAGR outperforms static alignment-based methods and other continual learning techniques, achieving superior stability.