Workshop
Workshop on Video Large Language Models
Mubarak Shah · Larry S. Davis · Rene Vidal · Son Dinh Tran · Angela Yao · Salman Khan · Rita Cucchiara · Cees G. M. Snoek · Christoph Feichtenhofer · Chang Xu · Jayakrishnan Unnikrishnan · Afshin Dehghan · Mamshad Nayeem Rizve · Rohit Gupta · Swetha Sirnam · Ashmal Vayani · Omkar Thawakar · Muhammad Uzair Khattak · Dmitry Demidov
Grand A1
Wed 11 Jun, 2 p.m. PDT
Keywords: Foundation models
This workshop will explore the evolution, applications, and challenges of Video Large Language Models (VidLLMs), the latest advancement in multimodal LLMs. It will feature keynote talks from leading researchers, a panel discussion comparing VidLLMs with expert models, and a poster session. The workshop also includes three challenge tracks that evaluate VidLLMs' capabilities in compositional video retrieval, complex video reasoning and robustness, and multilingual video reasoning. These tracks aim to address key research areas such as training VidLLMs, applying them to specialized computer vision tasks, and evaluating their performance. Potential topics for invited papers include VidLLM methods and algorithms; data creation; evaluation and analysis; best practices; applications; and limitations, risks, and safety.