VSI: Visual–Subtitle Integration for Keyframe Selection to Enhance Long Video Understanding
Jianxiang He, Meisheng Hong, Jungang Li, Weiyu Guo, Xuming Hu, Hui Xiong
Keywords: Vision, Language, and Reasoning