Poster

VecAttention: Vector-wise Sparse Attention for Accelerating Long Context Inference

Anmin Liu ⋅ Ruixuan Yang ⋅ Huiqiang Jiang ⋅ Bin Lin ⋅ Minmin Sun ⋅ Yong Li ⋅ CHEN ZHANG ⋅ Tao Xie

Abstract
