Poster

Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding

Fatih Ilhan ⋅ Gaowen Liu ⋅ Ramana Kompella ⋅ Selim Tekin ⋅ Tiansheng Huang ⋅ Zachary Yahn ⋅ Yichang Xu ⋅ Ling Liu

Abstract
