Poster Sat, Jun 6, 2026 • 3:45 PM – 5:45 PM PDT ExHall A & F 85

Resolving the Stability-Plasticity Dilemma in Reinforcement Learning via Complementary Continual Critics

Bo Sun ⋅ Peixi Peng ⋅ Guang Tan ⋅ Haoran Xu ⋅ Yaokun Li ⋅ Yiqian Chang ⋅ Shuaixian Wang ⋅ Luntong Li

Abstract

This paper proposes the Continual Dual-Critic with Cross-Attention (CD-CCA) framework for visual reinforcement learning to address the plasticity-stability conflict. Our method introduces continual learning techniques into the visual RL architecture, constructing two complementary critics using Continual Backpropagation (CBP) and Elastic Weight Consolidation (EWC): one maintains representational plasticity for rapid environmental adaptation, and the other preserves knowledge stability to prevent catastrophic forgetting. Furthermore, we design a cross-attention-based fusion mechanism that balances the value estimates from the dual critics according to observation characteristics. Experimental results on the DeepMind Control and CARLA benchmarks show that CD-CCA effectively mitigates representation drift and policy degradation. Compared to existing visual RL methods, our approach exhibits enhanced robustness and adaptability in non-stationary environments and long-horizon decision-making tasks, providing a new architectural paradigm for continual reinforcement learning.
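
The abstract does not specify how the two critics are regularized or fused, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It shows the standard EWC quadratic penalty (a diagonal Fisher estimate weighting squared drift from anchor parameters) and a single-query scaled dot-product attention that turns an observation embedding into softmax weights over two critic value estimates. The names `ewc_penalty`, `fuse_values`, `query`, `critic_keys`, and all numeric values are hypothetical; the CBP side (reinitializing low-utility units) is not shown.

```python
import math


def ewc_penalty(params, anchor, fisher_diag, lam=1.0):
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor, fisher_diag)
    )


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_values(query, critic_keys, critic_values):
    """Attention-weighted average of critic value estimates.

    Assumption: the observation embedding acts as the query and each critic
    contributes one key vector; the paper's actual fusion may differ.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in critic_keys])
    fused = dot(weights, critic_values)
    return fused, weights


# Hypothetical inputs: one observation embedding, two critics
# (index 0 standing in for the plastic critic, index 1 for the stable one).
query = [0.2, -0.1, 0.4]
critic_keys = [[0.5, 0.0, 0.3], [-0.2, 0.1, 0.6]]
critic_values = [1.5, 0.9]

fused, weights = fuse_values(query, critic_keys, critic_values)
print("fusion weights:", weights)
print("fused value:", fused)
```

Because the weights come from a softmax, the fused estimate is a convex combination of the two critic outputs, so it always lies between them; which critic dominates depends on how the observation embedding aligns with each key.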