Active Perceptual Inference: A Corticothalamic-Inspired Dynamic Nested Recurrent Network for Multimodal Sentiment Analysis with Incomplete Data
Abstract
Randomly missing frame-level data is a critical challenge in multimodal sentiment analysis. Existing methods are largely limited to passive completion through single-pass feedforward connections and static cross-modal fusion, and therefore struggle to generate high-quality completed features. The brain, by contrast, is not a passive recipient of external information but a dynamic system that performs active perceptual inference. At its core are dynamic nested recurrent loops, formed by intra-cortical recurrent completion mechanisms and corticothalamic circuits, which carry out perceptual inference iteratively. Inspired by this, we propose the Dynamic Nested Recurrent Network (DNRNet). It is the first to introduce recurrent inference into the data completion task, shifting the paradigm from passive completion to active perceptual inference. Its local recurrent loop simulates intra-cortical recurrent pattern completion to perform perceptual inference and generate local correction features. Its global recurrent loop simulates the modulatory function of the thalamus: it computes modality confidence to dynamically weight and integrate cross-modal information, generating global correction features. The local and global correction features are fused into a completion signal, which is combined with the input features of the current iteration to form the input for the next iteration. Experiments on the MOSI, MOSEI, and SIMS datasets demonstrate that DNRNet achieves an average accuracy improvement of 1.5%–2.0% over baseline models across all missing rates, validating the superiority of the brain-inspired approach in complex missing-data scenarios.
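The iteration described above can be illustrated with a minimal NumPy sketch. It is not the DNRNet implementation: the function name `nested_recurrent_completion`, the random weight matrices standing in for learned parameters, the tanh local update, the softmax confidence over modalities, the equal-weight fusion, and the additive update are all assumptions chosen only to make the data flow concrete.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def nested_recurrent_completion(x0, n_iters=3, seed=0):
    """Illustrative nested recurrent completion (hypothetical, not DNRNet itself).

    x0: array of shape (M, T, D) holding M modalities, T frames, D feature
        dimensions, with missing frames set to zero.
    Returns the refined features (M, T, D) and the last confidences (M, T).
    """
    rng = np.random.default_rng(seed)
    M, T, D = x0.shape
    # Random stand-ins for parameters that would be learned in practice.
    W_local = rng.normal(scale=0.1, size=(M, D, D))
    W_global = rng.normal(scale=0.1, size=(M, D, D))
    w_conf = rng.normal(scale=0.1, size=(M, D))

    x = x0.copy()
    conf = None
    for _ in range(n_iters):
        # Local loop: per-modality correction computed from the current features.
        local = np.tanh(np.einsum("mtd,mde->mte", x, W_local))

        # Global loop: per-frame modality confidence, normalized across modalities.
        scores = np.einsum("mtd,md->mt", x, w_conf)
        conf = softmax(scores, axis=0)

        # Confidence-weighted integration of cross-modal information, shape (T, D).
        projected = np.einsum("mtd,mde->mte", x, W_global)
        global_corr = (conf[..., None] * projected).sum(axis=0)

        # Fuse both corrections (equal weights assumed) into a completion signal.
        completion = 0.5 * (local + global_corr[None, :, :])

        # Combine with the current input to form the next iteration's input.
        x = x + completion
    return x, conf


# Example: three modalities, eight frames, four feature dimensions,
# with roughly 30% of frames zeroed out to mimic random frame-level missingness.
demo_rng = np.random.default_rng(1)
feats = demo_rng.normal(size=(3, 8, 4))
observed = demo_rng.random(size=(3, 8)) > 0.3
feats = feats * observed[..., None]
completed, confidence = nested_recurrent_completion(feats, n_iters=3)
```

In this sketch the confidences sum to one across modalities at every frame, so the global correction is a convex combination of the projected modality features. A trained model would replace the random matrices with learned modules and might, for instance, leave observed frames untouched.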