Poster

MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning

Chenglong Wang ⋅ Yifu Huo ⋅ Yang Gan ⋅ Qiaozhi He ⋅ Qi Meng ⋅ Bei Li ⋅ Yan Wang ⋅ Junfu Liu ⋅ Tianjua Zhou ⋅ JingBo Zhu ⋅ Tong Xiao

Abstract
