Poster

SegMo: Co-Designing Content-Aware Sparsity and Locally-Cohesive Segment Parallelism for Efficient VLM Inference

Haojuan Li ⋅ Ruohan Tang ⋅ Dongzhou Cheng ⋅ Zongpu Zhang ⋅ Jian Li ⋅ Jiaqi Wang

Abstract
