Poster

What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models

Yingqi Fan ⋅ Junlong Tong ⋅ Anhao Zhao ⋅ Xiaoyu Shen

Abstract
