

Poster

See, Think, Act: Teaching Multimodal Agents to Effectively Interact with GUI by Identifying Toggles

Zongru Wu ⋅ Rui Mao ⋅ Zhiyuan Tian ⋅ Pengzhou Cheng ⋅ Tianjie Ju ⋅ Zheng Wu ⋅ Lingzhong Dong ⋅ Haiyue Sheng ⋅ Zhuosheng Zhang ⋅ Gongshen Liu

Abstract
