

Oral

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Christopher Clark ⋅ Jieyu Zhang ⋅ Zixian Ma ⋅ Jae Sung Park ⋅ Rohun Tripathi ⋅ Sangho Lee ⋅ Reza Salehi ⋅ Jason Ren ⋅ Chris Dongjoo Kim ⋅ Yinuo Yang ⋅ Vincent Shao ⋅ Yue Yang ⋅ Weikai Huang ⋅ Ziqi Gao ⋅ Taira Anderson ⋅ Jianrui Zhang ⋅ Jitesh Jain ⋅ George Stoica ⋅ Ali Farhadi ⋅ Ranjay Krishna

Abstract
