Tutorial
Evaluating Large Multi-modal Models: Challenges and Methods
The proliferation of large multi-modal models (LMMs) has raised increasing concerns about their security and risks, concerns that stem largely from a limited understanding of their capabilities and limitations. In this tutorial, we aim to fill this gap by presenting a holistic overview of LMM evaluation. First, we discuss recent advances in LMM evaluation from the perspectives of what, where, and how to evaluate. We then present several key challenges in LMM evaluation, such as data contamination and fixed complexity, and introduce approaches to overcoming them. Furthermore, our discussion covers key evaluation metrics, including trustworthiness, robustness, and fairness, as well as performance across diverse downstream tasks in the natural and social sciences. We conclude with an overview of widely used code libraries and benchmarks that support these evaluation efforts. We hope that academic and industrial researchers will continue working to make LMMs more secure, responsible, and accurate.