Conflict-Aware Adaptive Cross-Reconstruction for Multimodal Sentiment Analysis
Abstract
Disentanglement-based methods for learning shared representations are widely used in multimodal sentiment analysis. However, most of them adopt an intra-modal reconstruction strategy and rely on similarity losses to align shared representations, often ignoring potential emotional conflicts across modalities within the same sample and thereby distorting the shared semantics. To address these issues, we propose a Conflict-aware Adaptive Cross-Reconstruction (CACR) approach. First, we formally define emotional conflict and design a conflict-aware weighting strategy: it computes sample-level conflict scores from modality consistency metrics and maps them to dynamic weights for each modality's cross-reconstruction loss. Second, building on this strategy, we construct a cross-reconstruction module that, for each modality, reconstructs its representation from that modality's specific features and the shared features of the other modalities, adaptively weighting each cross-reconstruction term with the aforementioned weights; this achieves implicit alignment of the shared representations while mitigating semantic ambiguity. Extensive experiments on three widely used benchmarks show that CACR outperforms existing state-of-the-art methods on six evaluation metrics, demonstrating its effectiveness in handling modality-level emotional conflict.
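The weighting-and-reconstruction scheme described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the cosine-similarity consistency metric, the softmax mapping from conflict scores to weights, the temperature `tau`, and the `decode` helper are all hypothetical choices made for the sketch.

```python
import numpy as np

def conflict_weights(shared, tau=1.0):
    """shared: dict modality -> (batch, d) array of shared representations.
    Assumed metric: conflict score = 1 - mean cosine similarity to the
    other modalities; a softmax over modalities maps scores to per-sample
    weights that down-weight more conflicted modalities."""
    def cos(a, b):
        num = (a * b).sum(-1)
        den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-8
        return num / den

    mods = list(shared)
    scores = np.stack([
        1.0 - np.mean([cos(shared[m], shared[o]) for o in mods if o != m], axis=0)
        for m in mods
    ])                                      # (num_modalities, batch)
    e = np.exp(-scores / tau)               # higher conflict -> lower weight
    w = e / e.sum(axis=0, keepdims=True)    # weights sum to 1 per sample
    return {m: w[i] for i, m in enumerate(mods)}

def weighted_cross_recon_loss(specific, shared, inputs, decode):
    """For each modality, reconstruct its input from its own specific
    features concatenated with the averaged shared features of the OTHER
    modalities, then average the per-sample MSE under the conflict-aware
    weights. `decode(modality, features)` is a stand-in for a learned
    modality-specific decoder."""
    weights = conflict_weights(shared)
    mods = list(shared)
    loss = 0.0
    for m in mods:
        others = np.mean([shared[o] for o in mods if o != m], axis=0)
        recon = decode(m, np.concatenate([specific[m], others], axis=-1))
        per_sample = ((recon - inputs[m]) ** 2).mean(axis=-1)   # (batch,)
        loss += (weights[m] * per_sample).mean()
    return loss
```

In a real model the decoders would be trained networks and this loss would be one term alongside the task loss; the sketch only shows how the sample-level conflict weights modulate each cross-reconstruction term.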