Timezone: America/Denver
Tutorial

Edge AI in Action: Mastering On-Device Inference

Fabricio Batista Narcizo · Elizabete Munzlinger · Sai Narsi Reddy Donthi Reddy · Shan Ahmed Shaffi
9:00 AM - 1:00 PM
Tutorial

The Principles of Diffusion Models: Real-Time Continuous & Discrete Diffusion

Chieh-Hsin Lai · Subham Sahoo · Dongjun Kim · Yang Song · Yuki Mitsufuji · Stefano Ermon
9:00 AM - 1:00 PM

Overview

We present a concise, hands-on tutorial on fast diffusion-based generation across continuous and discrete data, featuring live demos that attendees can readily adapt for their own research.

Continuous Diffusion

The first part is based on The Principles of Diffusion Models, which unifies diffusion through variational-based, score-based, and flow-based viewpoints, then focuses on efficiency: ODE samplers (Euler/Heun-type), distillation of pretrained diffusion models into few-step generators (e.g., DMD), and flow-map alternatives including Consistency Models, Consistency Trajectory Models, and MeanFlow. We focus on first principles, together with practical training recipes and live demos.

Discrete Diffusion

The second part focuses on discrete diffusion. We introduce its core theoretical foundations, with emphasis on Diffusion Duality, which shows how discrete diffusion processes can emerge from Gaussian diffusion and provides a principled way to design discrete analogues of continuous-space methods. Building on this framework, we present Discrete Consistency Distillation for few-step generation in discrete diffusion models, and walk through its training and practical implementation. We conclude by exploring two families of samplers: those enabling few-step generation and those supporting inference-time scaling.

The tutorial is intended for participants familiar with neural networks and PyTorch, with some background in classic generative modeling concepts.


Tutorial
9:00 AM - 1:00 PM

Diffusion models and flow-based methods have revolutionized generative learning in the visual domain, setting new standards for image, video, and 3D content creation. However, as the field shifts toward interactive applications—such as real-time editing, world models, and embodied AI—the need for low-latency feedback has become critical. Currently, the high computational cost of iterative sampling hinders real-world deployment. While various acceleration techniques exist, the lack of a unified resource makes it difficult to bridge the gap between theory and practice.

Tutorial

As deep learning systems are increasingly deployed in high-stakes applications, understanding their behavior is critical for ensuring trust and safety. Interpretability provides essential tools to explain, debug, and improve these models. However, the field remains fragmented, spanning a wide range of methods and assumptions, while lacking standardized evaluation protocols. This tutorial aims to provide a unified overview of interpretability in deep learning, bridging post-hoc mechanistic understanding and methods to design inherently interpretable deep learning models. By the end of this tutorial, attendees will gain a solid understanding of modern interpretability methods for deep learning models, how to rigorously evaluate them, and open research directions in this critical area.

Workshop

Workshop on Autonomous Driving

Vincent Casser ⋅ Jose M. Alvarez
9:00 AM - 6:00 PM
Workshop

LatinX in Computer Vision Research Workshop

Francisco Lopez-Tiro ⋅ Dustin Carrión-Ojeda ⋅ Willams De Lima ⋅ Hernan Dario Benitez ⋅ Ana Maria Quintero ⋅ William de Lima
9:00 AM - 1:00 PM
Workshop
Workshop
9:00 AM - 6:00 PM
Workshop

Workshop on "Bitter Lessons"

Anand Bhattad ⋅ Aditya Prakash
9:00 AM - 1:00 PM
Workshop

IPA: Interactive Physical AI Workshop

Seonwook Park ⋅ Amrita Mazumdar
9:00 AM - 1:00 PM
Workshop
Workshop
9:00 AM - 1:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop

2nd Workshop on Photorealistic 3D Head Avatars

Tobias Kirschstein ⋅ Simon Giebenhain
9:00 AM - 1:00 PM
Workshop

PHAROS AI Factory for Medical Imaging & Healthcare

Stefanos Kollias ⋅ Xujiong Ye
9:00 AM - 1:00 PM
Workshop

Computational Cameras and Displays

Vishwanath Saragadam ⋅ Fei Xia
9:00 AM - 6:00 PM
Workshop

Workshop on Vision-based Assistants in the Real-World

Apratim Bhattacharyya ⋅ Fadime Sener
9:00 AM - 1:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop
Workshop
Workshop
9:00 AM - 1:00 PM

Computer vision for high-stakes, real-world applications necessitates robust explanation and transparency to ensure trust, accountability, and ethical deployment. Celebrating its 5th Anniversary, the Explainable AI for Computer Vision (XAI4CV) workshop provides a premier forum for the entire spectrum of XAI research, from interpretable-by-design models to challenges in multimodal foundational models. The program includes invited talks, spotlight papers, a poster session, and a tutorial. XAI4CV accepts paper and demo submissions to define the future of trustworthy visual AI.

Workshop

Third Joint Egocentric Vision (EgoVis) Workshop

Siddhant Bansal ⋅ Tushar Nagarajan
9:00 AM - 6:00 PM
Workshop

Visual General Intelligence

Hirokatsu Kataoka ⋅ Yoshihiro Fukuhara
9:00 AM - 6:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop
Workshop
9:00 AM - 6:00 PM
Workshop

The 5th DataCV Workshop and Challenge

Liang Zheng ⋅ Yue Yao
9:00 AM - 1:00 PM
Workshop

Women in Computer Vision

Karen Sanchez ⋅ Carla Muntean
9:00 AM - 1:00 PM
Workshop

Efficient Deep Learning for Computer Vision

Shuai Zhang ⋅ Yung-Hsiang Lu
9:00 AM - 6:00 PM
Workshop
Workshop
Workshop
9:00 AM - 1:00 PM
Workshop

Foundation Models for Autonomous Driving

Walter Zimmer ⋅ Rui Song
9:00 AM - 6:00 PM
Workshop

The 22nd Embedded Vision Workshop

Matteo Poggi ⋅ Tse-Wei Chen
9:00 AM - 1:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop

The 3rd AI for Visual Arts Workshop and Challenges

Deblina Bhattacharjee ⋅ Bahar Aydemir
9:00 AM - 1:00 PM
Workshop

Multimodal Alignment for a Pluralistic Society

Perampalli Shravan Nayak ⋅ Aishwarya Agrawal
9:00 AM - 1:00 PM
Workshop

On Sensor Vision Workshop

Andrew J. Davison ⋅ Shinjeong Kim
9:00 AM - 1:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop

Generative AI for Sign Language

Hezhen Hu ⋅ Yuecong Min
9:00 AM - 1:00 PM
Workshop

Generative AI for XR and Identity-based Applications

Brendan David-John ⋅ Chris Thomas
9:00 AM - 1:00 PM
Workshop

AI for Content Creation

James Tompkin ⋅ Krishna Kumar Singh
9:00 AM - 1:00 PM
Tutorial

From Perception to Simulation: The Emergence of World Models in Multi-modal Reasoning

Yujun Cai · Jianfei Cai · Yiwei Wang · Ming-Hsuan Yang
1:00 PM - 6:00 PM

World models are rapidly reshaping artificial intelligence, evolving from systems that passively perceive the world into engines capable of simulating, reasoning, and planning within it. This tutorial examines how recent advances in generative modeling, self-supervised learning, and multimodal architectures are enabling machines to move beyond recognition and prediction toward mental simulation, counterfactual reasoning, and decision making.

We will explore the foundations of world models, approaches for learning dynamics from visual and multimodal data, and the integration of planning and reasoning. The tutorial highlights connections between video generation, diffusion models, discrete representations, and embodied AI, while addressing key challenges such as grounding, causality, physical consistency, and evaluation.

Designed for researchers, practitioners, and students, this session provides both conceptual insights and practical perspectives on building AI systems that reason about environments rather than merely interpreting them.
Tutorial

Building GenAI based Simulation Environment for End-to-End Autonomous Driving

Henry Liu · Howie Sun · Jun Gao · Shuo Feng · Xintao Yan · Jiawei Wang
1:00 PM - 6:00 PM

End-to-end autonomous driving systems require simulation environments capable of exposing models to diverse, realistic, and safety-critical long-tail events that rarely appear in real-world data. Traditional simulators, which rely on scripted scenarios, simplified traffic logic, and static 3D assets, capture only a narrow slice of real traffic complexity and fail to exercise modern data-driven AV stacks in a meaningful, system-level manner. As end-to-end policies blur the boundaries between perception, prediction, and planning, new generative, data-first, closed-loop simulation workflows are needed to bridge the gap between real-world distributions and synthetic environments.

This tutorial aims to demonstrate how generative AI and world models can build end-to-end simulation pipelines that directly support learning-based AV systems. We focus on practical, reproducible methods involving city-scale digital twins, data-driven traffic behavior models, generative corner-case synthesis, and sensor-level simulation tailored for perception and end-to-end policies. Participants will gain both conceptual understanding and hands-on entry points (code, tools, datasets, and minimal templates) to design or extend their own generative simulation systems.

This tutorial walks through the complete pipeline for generative end-to-end AV simulation. We begin by defining what distinguishes end-to-end simulation from classical AV simulators and how policy-driven requirements reshape simulation design. We then introduce world modeling and city-scale digital twins, covering data-driven reconstruction of road layouts, traffic rules, and naturalistic human driving behavior. Next, we discuss generative modeling of rare and adversarial scenarios derived from crash reports, regulations, or textual descriptions. We follow with sensor and video simulation, comparing graphics engines, neural rendering, and video foundation models for producing realistic, multi-view, and temporally consistent sensor data. Finally, we integrate these components into a full pipeline and discuss system-level evaluation, failure analysis, and open challenges in validating generative simulation and aligning with safety standards.

Speakers

To be announced.

Schedule

Title · Speaker · Time
Introduction & Motivation · TBD · TBD
Module 1: World Modeling & Digital Twins · TBD · TBD
Module 2: Generative Corner-Case & Scenario Synthesis · TBD · TBD
Break · - · TBD
Module 3: Sensor Simulation, Video Generation & End-to-End Pipelines · TBD · TBD
Module 4: Testing Open-Source AV Stacks · TBD · TBD
Closing Discussion · TBD · TBD

Organizers

Henry Liu · University of Michigan
Howie Sun · SaferDrive AI
Jun Gao · University of Michigan / NVIDIA
Shuo Feng · Tsinghua University
Xintao Yan · University of Hong Kong
Jiawei Wang · University of Michigan

Related Publications & Resources

Feng, S., Sun, H., Yan, X., Zhu, H., Zou, Z., Shen, S., & Liu, H. X. (2023). Dense reinforcement learning for safety validation of autonomous vehicles. Nature, 615(7953), 620–627.

Liu, H. X., & Feng, S. (2024). Curse of rarity for autonomous vehicles. Nature Communications, 15(1), 4808.

Yan, X., Zou, Z., Feng, S., Zhu, H., Sun, H., & Liu, H. X. (2023). Learning naturalistic driving environment with statistical realism. Nature Communications, 14(1), 2037.

Feng, S., Yan, X., Sun, H., Feng, Y., & Liu, H. X. (2021). Intelligent driving intelligence test for autonomous vehicles with naturalistic and adversarial environment. Nature Communications, 12(1), 748.

Sun, H., Yan, X., Qiao, Z., Zhu, H., Sun, Y., Wang, J., ... & Liu, H. X. (2025). TeraSim: Uncovering unknown unsafe events for autonomous vehicles through generative simulation. arXiv preprint arXiv:2503.03629.

Wang, J., Sun, H., Yan, X., Feng, S., Gao, J., & Liu, H. X. (2025). TeraSim-World: Worldwide safety-critical data synthesis for end-to-end autonomous driving. arXiv preprint arXiv:2509.13164.

Ren, X., Lu, Y., Cao, T., Gao, R., Huang, S., Sabour, A., ... & Ling, H. (2025). Cosmos-Drive-Dreams: Scalable synthetic driving data generation with world foundation models. arXiv preprint arXiv:2506.09042.

TeraSim: https://github.com/mcity/TeraSim
Cosmos-Drive: https://github.com/nv-tlabs/Cosmos-Drive-Dreams

Tutorial
1:00 PM - 6:00 PM
Tutorial

Monte Carlo physical simulation

Rohan Sawhney · Bailey Miller · Ioannis Gkioulekas · Keenan Crane
1:00 PM - 6:00 PM

Abstract: Accurately analyzing large amounts of geometric data is critical for many scientific and engineering applications. Techniques based on partial differential equations (PDEs) provide powerful tools for analyzing physical systems, but conventional solvers are not at a stage where they “just work” on problems of real-world complexity. A constant challenge is spatial discretization, which divides the domain into a high-quality volumetric mesh or background grid for PDE-based analysis. Unfortunately, this approach does not scale well to modern computer architectures, and as such, there remains a large divide between our ability to visualize and analyze the natural world.

Tutorial

3D Human Mesh Modeling and Recovery from RGB and LiDAR

Romain Bregier · Istvan Sarandi · Salma Galaaoui · Fabien Baradel · Nermin Samet · David Picard
1:00 PM - 6:00 PM

Understanding human pose and shape is the cornerstone of many AI applications, ranging from monitoring, AR/VR, sport and posture analysis, and human-robot interaction all the way to autonomous driving. Accurate human perception enables digital systems to interact appropriately with people in both indoor and outdoor environments.

Recent advances have pushed the field forward: modern methods now begin to achieve strong in-the-wild Human Mesh Recovery (HMR) performance, making them more reliable and useful for a wide variety of downstream tasks. With this growing interest, the community has seen the emergence of datasets and shape-recovery models, as well as an expanding range of input modalities, including RGB, depth, and LiDAR. At the same time, multiple human body models are being developed, each offering different levels of detail, interpretability, and expressivity.

While these developments open up exciting new opportunities, they also introduce new challenges. Designing and deploying human mesh recovery systems remains difficult due to dependency on the chosen body model, peculiarities of single-person and multi-person settings, challenges of occlusions and interactions with the 3D scene, and the reliance on data-hungry training pipelines.

This tutorial is therefore motivated by the need for a clear, structured, and accessible overview of the current HMR landscape. The increasing use of foundation models and large-scale pretrained systems makes it particularly timely to disseminate a clear picture of the underlying principles of human body modeling and HMR, so that these methods can be more easily adopted, extended, and applied to adjacent fields beyond core human pose estimation. Our goal is to lower the entry barrier for newcomers, provide a unifying perspective for practitioners, and foster collaboration between communities working on human modeling, 3D vision, graphics, and embodied AI. By providing access to these concepts, we aim to maximize the impact of recent advances and encourage their use in downstream applications.

Workshop
1:00 PM - 6:00 PM
Workshop

Cognitive Foundations for Multimodal Models

Aditya Chinchure ⋅ Sahithya Ravi
1:00 PM - 6:00 PM
Workshop

10th Affective & Behavior Analysis in-the-wild

Dimitrios Kollias ⋅ Panagiotis Tzirakis
1:00 PM - 6:00 PM
Workshop

Computer Vision for Biomechanics Workshop

Ethan Goan ⋅ Akila Pemasiri
1:00 PM - 6:00 PM
Workshop

Authenticity & Provenance in the age of Generative AI

Shruti Agarwal ⋅ Sarah Barrington
1:00 PM - 6:00 PM
Workshop

Computer Vision for the Built World

Iro Armeni ⋅ Fuxin Li
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop

Synthetic & Adversarial ForEnsics

Josué Martínez-Martínez ⋅ Pooya Khorrami
1:00 PM - 6:00 PM
Workshop

Spatial Intelligence for Cultural Heritage

Marina Paolanti ⋅ Roberto Pierdicca
1:00 PM - 6:00 PM
Workshop

Sixth Workshop on Neural Architecture Search

Stephen McGough ⋅ Amir Atapour-Abarghouei
1:00 PM - 6:00 PM
Workshop

End-to-End 3D Learning

Zhiwen Fan ⋅ Dimitris Metaxas
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM

Machine-learned visual systems are transforming numerous fields such as autonomous driving, biodiversity assessment, and ecological monitoring, but they hunger for vast, high-quality annotated data. Asking domain experts to manually annotate large-scale data is unrealistic; the current paradigm to scale up data annotation is to have domain experts craft annotation guidelines using visual examples and descriptions for non-expert annotators to apply. This paradigm is commonly adopted by companies which provide data labeling services. Lacking domain knowledge, ordinary annotators often produce annotations that are erroneous, subjective, biased, and inconsistent. Further, this process is labor-intensive, tedious, and costly. This workshop aims to pioneer auto-annotation, developing AI agents that can interpret expert-crafted annotation guidelines and generate labels automatically. In essence, we seek to replace ordinary human annotators with AI.

Workshop

Machine Unlearning for Vision

Alessio Sampieri ⋅ Bardh Prenkaj
1:00 PM - 6:00 PM
Workshop

2nd Workshop on Multimodal Sign Language Recognition

Raffaele Mineo ⋅ Hamzah Luqman
1:00 PM - 6:00 PM

MSLR 2026 is the second edition of a rapidly growing venue on multimodal sign language recognition and translation. The program combines invited talks, a peer-reviewed track published in CVPR Workshops, and the SignEval Challenge featuring updated datasets for isolated LIS and continuous SLR. We emphasize privacy-preserving sensing (e.g., radar), healthcare accessibility, and inclusive practices with sign interpreters. Building on the success at ICCV 2025, MSLR 2026 will consolidate a global, interdisciplinary community spanning computer vision, linguistics, healthcare, and Deaf studies.

Workshop

The 3rd MetaFood Workshop (MTF)

Yuhao Chen ⋅ Petia Radeva
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop

Workshop on Multimodal Human Motion Analysis

Olivia Nocentini ⋅ Rishabh Dabral
1:00 PM - 6:00 PM