Workshop

Workshop on Video Large Language Models

Mubarak Shah · Larry S. Davis · Rene Vidal · Son Dinh Tran · Angela Yao · Salman Khan · Rita Cucchiara · Cees G. M. Snoek · Christoph Feichtenhofer · Chang Xu · Jayakrishnan Unnikrishnan · Afshin Dehghan · Mamshad Nayeem Rizve · Rohit Gupta · Swetha Sirnam · Ashmal Vayani · Omkar Thawakar · Muhammad Uzair Khattak · Dmitry Demidov

Grand A1

Wed 11 Jun, 2 p.m. PDT

Keywords: Foundation models

This workshop will explore the evolution, applications, and challenges of Video Large Language Models (VidLLMs), the latest advancement in multimodal LLMs. It will feature keynote talks from leading researchers, a panel discussion comparing VidLLMs with expert models, and a poster session. The workshop also includes three challenge tracks that evaluate VidLLMs' capabilities in compositional video retrieval, complex video reasoning and robustness, and multilingual video reasoning. These tracks address key research areas such as training VidLLMs, applying them to specialized computer vision tasks, and evaluating their performance. Potential topics for invited papers include VidLLM methods and algorithms, data creation, evaluation and analysis, best practices, applications, and limitations, risks, and safety.
