

Poster

Dynamic Important Example Mining for Reinforcement Finetuning

Haoru Tan ⋅ WU Sitong ⋅ Yanfeng Chen ⋅ Shizhen Zhao ⋅ Yangtian Sun ⋅ Tianjia Liu ⋅ Chirui Chang ⋅ Shaofeng Zhang ⋅ Xingwu Sun ⋅ Xiuzhe Wu ⋅ Ruobing Xie ⋅ Xiaojuan Qi

Abstract
