Poster

See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding

Bo-Yuan Sun ⋅ Bowen Yin ⋅ Yuanming Li ⋅ Xihan Wei ⋅ Qibin Hou

Abstract
