Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy
Abstract
Unsupervised deraining has attracted increasing attention due to its flexible data requirements during model training. However, the absence of paired supervision makes it challenging for a network to attain a compact optimization space when faced with complex and diverse rain degradations. Moreover, some high-quality deraining results produced during training are discarded, despite their potential to constrain the optimization space. To address these issues, we introduce a Reward-Guided Self-Reinforcement Unsupervised Image Deraining framework, RGSUD. RGSUD consists of two stages: reward recycling and self-reinforcement (SR) strategy training. For the former, we propose a Vision-Language Model (VLM) based dynamic reward recycling mechanism that selects the optimal deraining results from the network's outputs during training, allowing high-quality deraining results to be collected robustly. For the latter, reward-driven optimization is adopted to connect the collected rewards with the current deraining result, which constrains the optimization space of RGSUD. The network can thus learn deraining knowledge within a more compact optimization space, further improving deraining performance. The proposed SR strategy yields an improvement of over 1 dB on Rain100L and the real-world RealRain1K-L dataset compared to the baseline. Extensive experiments on multiple datasets demonstrate that our framework performs favorably against state-of-the-art unsupervised deraining methods.
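The two-stage loop described above can be illustrated with a minimal PyTorch-style sketch. This is not the paper's implementation: `vlm_quality_score` (here a crude total-variation proxy), `RewardBuffer`, the base loss, and the 0.1 loss weight are all hypothetical stand-ins; the actual framework uses a VLM-based reward and the paper's own unsupervised objectives.

```python
# Minimal sketch of reward recycling + self-reinforcement, under the
# assumptions stated above. All component names are illustrative.
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


def vlm_quality_score(img: torch.Tensor) -> torch.Tensor:
    """Placeholder for a VLM-based quality reward (higher = better).
    Here: negative total variation as a crude smoothness proxy."""
    tv = (img[..., :, 1:] - img[..., :, :-1]).abs().mean() \
       + (img[..., 1:, :] - img[..., :-1, :]).abs().mean()
    return -tv


class RewardBuffer:
    """Stage 1 (reward recycling): keep, per sample id, the
    highest-scoring deraining output seen so far during training."""
    def __init__(self):
        self.best = {}  # sample_id -> (score, output)

    def update(self, sample_id: int, output: torch.Tensor) -> None:
        score = vlm_quality_score(output).item()
        if sample_id not in self.best or score > self.best[sample_id][0]:
            self.best[sample_id] = (score, output.detach().clone())

    def get(self, sample_id: int) -> Optional[torch.Tensor]:
        entry = self.best.get(sample_id)
        return entry[1] if entry is not None else None


# Toy derainer and unpaired rainy inputs, just to make the loop runnable.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 3, 3, padding=1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
buffer = RewardBuffer()
rainy = [torch.rand(1, 3, 64, 64) for _ in range(4)]

for epoch in range(3):
    for sid, x in enumerate(rainy):
        out = model(x)

        # Stand-in for the baseline's unsupervised objective.
        loss = F.l1_loss(out, x)

        # Stage 2 (self-reinforcement): pull the current output toward
        # the best reward collected so far, constraining the
        # optimization space.
        reward = buffer.get(sid)
        if reward is not None:
            loss = loss + 0.1 * F.l1_loss(out, reward)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Stage 1: recycle this output as a reward if it scores best.
        buffer.update(sid, out)
```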