CVPR 2026 Wednesday 06/03

Registration

Registration / Badge Pickup

7:00 AM - 5:00 PM

Tutorial

The Principles of Diffusion Models: Real-Time Continuous & Discrete Diffusion

Chieh-Hsin Lai · Subham Sahoo · Dongjun Kim · Yang Song · Yuki Mitsufuji · Stefano Ermon

8:00 AM - 12:00 PM

In recent years, diffusion models have become a central paradigm in computer vision, powering advances in image synthesis, editing, and video generation. However, existing tutorials are often fragmented, focusing either on specific applications or isolated methodological perspectives without a unifying framework. This tutorial aims to present a principle-driven view of diffusion models by distilling their foundations into a small set of core ideas that unify variational, score-based, and flow-based approaches. It further emphasizes emerging directions in real-time generation through flow-map models, which enable fast and interactive visual applications. In addition, the tutorial extends this framework to discrete and tokenized diffusion models, highlighting their role in bridging continuous vision generation with multimodal and structured representations.

Tutorial

Tom Builds, Tom Breaks: Hands-On Attacks and Defenses for Vision-Language Systems

Pavan Reddy

8:00 AM - 12:00 PM

Vision-language models are increasingly deployed in real-world systems where images can directly influence decisions and actions, creating new security risks beyond traditional text-based attacks. This tutorial provides a hands-on introduction to attacks and defenses for vision-language systems, using a practical, end-to-end workflow that mirrors real deployment scenarios. It covers a range of vulnerabilities, including visual jailbreaks, preprocessing-induced attacks, adversarial perturbations, backdoored models, and data poisoning, along with corresponding mitigation strategies. Through interactive examples and reproducible notebooks, the tutorial emphasizes how these threats manifest in practice and how to build robust, auditable systems for multimodal AI.

Tutorial

Edge AI in Action: Mastering On-Device Inference

Fabricio Batista Narcizo · Elizabete Munzlinger · Sai Narsi Reddy Donthi Reddy · Shan Ahmed Shaffi

8:00 AM - 12:00 PM

Edge AI enables real-time, low-latency inference directly on devices, but achieving high performance and efficiency requires specialized optimization and deployment techniques tailored to heterogeneous hardware. This tutorial provides a hands-on guide to on-device inference, focusing on end-to-end workflows for optimizing and deploying deep learning models on leading edge platforms such as Qualcomm Snapdragon and NVIDIA Jetson. It covers key techniques including model compression, quantization, and hardware-aware optimization, along with practical tools such as SNPE and TensorRT. Through comparative analysis and real-world case studies, the tutorial highlights best practices for achieving efficient, low-latency performance in applications ranging from computer vision to multimodal AI.

Tutorial

Towards Safe Multi-Modal Learning: Evolving Threats and Safety Solutions

Xi Li · Manling Li · Muchao Ye

8:00 AM - 12:00 PM

Multi-modal learning has enabled powerful systems that combine text, images, audio, and video for perception, reasoning, and decision-making. At the same time, it has introduced safety challenges that differ fundamentally from those in traditional uni-modal learning. This tutorial presents a structured overview of the evolving safety landscape in multi-modal AI, focusing on emerging threat models and corresponding defense strategies. It examines key risks such as compromised modality integration, modality misalignment, and fused cross-modal vulnerabilities, and reviews recent work on adversarial attacks, jailbreaks, hallucinations, and safety solutions for more reliable multi-modal systems.

Workshop

Workshop on "Bitter Lessons"

Anand Bhattad ⋅ Aditya Prakash ⋅ Unnat Jain ⋅ Svetlana Lazebnik

8:00 AM - 12:00 PM

Workshop

Generative AI for XR and Identity-based Applications

Brendan David-John ⋅ Chris Thomas

8:00 AM - 12:30 PM

Workshop

GRAIL-V: Grounded Retrieval & Agentic Intelligence for Vision-Language

Amit Agarwal ⋅ Jyotika Singh ⋅ Vivek Gupta

8:00 AM - 12:00 PM

Workshop

The 3rd Workshop on Human Motion Generation - New Perspective on Simulation, Animation, and VR applications

Chuan Guo ⋅ Yuxuan Mu

8:00 AM - 12:00 PM

Workshop

LatinX in Computer Vision Research Workshop

Francisco Lopez-Tiro ⋅ Dustin Carrión-Ojeda ⋅ Willams De Lima ⋅ Hernan Dario Benitez ⋅ Ana Maria Quintero ⋅ William de Lima

8:00 AM - 12:00 PM

Workshop

Multimodal Foundation Models for Biomedicine: Challenges and Opportunities

Yuhui Zhang ⋅ Xiaohan Wang

8:00 AM - 12:00 PM

Workshop

The 2nd Workshop on Multimodal Spatial Intelligence

Juil Koo ⋅ Phillip Y. Lee

8:00 AM - 12:00 PM

Workshop

On Sensor Vision Workshop

Andrew J. Davison ⋅ Shinjeong Kim

8:00 AM - 12:45 PM

Workshop

22nd Workshop on Perception Beyond the Visible Spectrum

Riad I. Hammoud ⋅ Yi Ding

8:00 AM - 12:00 PM

Workshop

The 2nd International Workshop & Challenge on Subtle Visual Computing @CVPR 2026

Zitong Yu ⋅ Xun Lin

8:00 AM - 12:00 PM

Workshop

1st Workshop on Video World Models: Interaction, Memory, and Efficiency

Jiwen Yu ⋅ Xihui Liu

8:00 AM - 12:00 PM

Workshop

Women in Computer Vision

Karen Sanchez ⋅ Carla Muntean

8:00 AM - 12:00 PM

Workshop

Workshop on World Models Meet Active Sensing and Closed-Loop Planning

Jieneng Chen ⋅ Alan Yuille ⋅ Tianmin Shu ⋅ Rama Chellappa ⋅ Jianwen Xie ⋅ Yilun Du ⋅ Chen Wei

8:00 AM - 12:00 PM

Generative world models have mastered passive generation. But real intelligence is active. It observes, plans, acts, and learns from feedback. We focus on world models with active sensing and closed-loop planning.

Workshop

The 5th Explainable AI for Computer Vision (XAI4CV) Workshop

Miguel-Ángel Fernández-Torres ⋅ Jon Donnelly ⋅ Maximilian Dreyer ⋅ Marina Gavrilova ⋅ Dahye Kim ⋅ Indu Panigrahi ⋅ Sukrut Rao ⋅ Avinab Saha ⋅ Lenka Tětková ⋅ Yuhui Zhang

8:00 AM - 12:30 PM

Computer vision for high-stakes, real-world applications necessitates robust explanation and transparency to foster trust, accountability, and ethical deployment. Celebrating its 5th anniversary, the Explainable AI for Computer Vision (XAI4CV) workshop provides a premier forum for the entire spectrum of XAI research, from interpretable-by-design models to the challenges of multimodal foundation models. The program includes three invited talks, spotlight papers, a tutorial on concept-based explanations for the diagnosis and control of vision foundation models, and a poster session. XAI4CV accepts paper and demo submissions aimed at defining the future of trustworthy visual AI.

Workshop

PHAROS AI Factory for Medical Imaging & Healthcare

Stefanos Kollias ⋅ Xujiong Ye

8:00 AM - 12:30 PM

Workshop

Workshop on Agentic AI for Visual Media

Jinjin Gu ⋅ Lei Sun

8:00 AM - 5:00 PM

Workshop

Bridging Vision, Language, and Action: What’s Missing in Actionable Visual Perception for Robotics

Jiawei Ma ⋅ Chengzhi Mao

8:00 AM - 5:00 PM

Workshop

Autonomous Understanding Through Open-world Perception and Integrated Language models for On-road Tasks

Ali AlShami ⋅ Ryan Rabinowitz

8:00 AM - 5:00 PM

Workshop

Foundation Models for Autonomous Driving

Walter Zimmer ⋅ Rui Song

8:00 AM - 6:00 PM

Workshop

From Lab Demos to Daily Tasks: Embodied Intelligence in the Wild

Huijie Wang ⋅ Hongyang Li

8:00 AM - 5:00 PM

Workshop

13th Workshop on Fine-grained Visual Categorization

Nico Lang ⋅ Lukas Picek

8:00 AM - 5:00 PM

Workshop

4th Workshop on Vision Based Industrial Inspection

Shancong Mou ⋅ Hao Yan

8:00 AM - 5:00 PM

Workshop

The 1st Workshop on Deployment of Foundation Models for Embodied AI

Burhan Yaman ⋅ Xin Ye

8:00 AM - 5:00 PM

Workshop

Workshop on Vision-based Assistants in the Real-World

Apratim Bhattacharyya ⋅ Fadime Sener

8:15 AM - 1:00 PM

Workshop

Multimodal Alignment for a Pluralistic Society

Perampalli Shravan Nayak ⋅ Aishwarya Agrawal

8:20 AM - 12:30 PM

Workshop

AI for Creative Visual Content Generation, Editing and Understanding

Ozgur Kara ⋅ Junho Kim

8:25 AM - 12:35 PM

Workshop

IPA: Interactive Physical AI Workshop

Seonwook Park ⋅ Amrita Mazumdar

8:25 AM - 1:00 PM

Workshop

AI for Content Creation

James Tompkin ⋅ Krishna Kumar Singh

8:30 AM - 12:30 PM

Content creation underpins photography, film and video, gaming, virtual and augmented reality, art, design, fashion, and advertising. In just a few years, generative AI has compressed work that once took hours of painstaking manual effort into seconds of automated or interactive creation. Text-to-image models now produce photorealistic and stylized imagery on demand, while video generation and world models synthesize long, temporally consistent—and increasingly interactive and controllable—clips and environments. Rapid progress in 3D and 4D generation, neural rendering, and Gaussian splatting is bringing the same leap to objects, avatars, and dynamic scenes, and unified multi-modal models now tie together text, image, video, and audio for any-to-any creation. Beyond raw synthesis, the community is advancing instruction-based and controllable editing, real-time and on-device generation, identity- and physics-consistent results, and agentic pipelines that plan and compose creative tasks. These capabilities also raise pressing questions of control, attribution, safety, and bias—while offering powerful new sources of synthetic training data for downstream vision tasks in 2D, video, and 3D.

The AI for Content Creation workshop explores this exciting and fast-moving research area. We bring together invited speakers of world-class expertise in content creation, up-and-coming researchers, and authors of submitted workshop papers, to engage in a day filled with learning, discussion, and network building.

Workshop

The 3rd AI for Visual Arts Workshop and Challenges

Deblina Bhattacharjee ⋅ Bahar Aydemir

8:30 AM - 12:30 PM

Workshop

The 5th DataCV Workshop and Challenge

Liang Zheng ⋅ Yue Yao

8:30 AM - 12:30 PM

Workshop

The 5th Workshop on Federated Learning for Computer Vision

Chen Chen ⋅ Guangyu Sun

8:30 AM - 11:59 AM

Workshop

Generative AI for Sign Language

Hezhen Hu ⋅ Yuecong Min

8:30 AM - 12:30 PM

Workshop

Sense of Space: Multi-Sensory Modeling for Embodied Intelligence

Rao Fu ⋅ Li Guan

8:30 AM - 5:00 PM

Workshop

Visual General Intelligence

Hirokatsu Kataoka ⋅ Yoshihiro Fukuhara

8:30 AM - 6:00 PM

Workshop

AI4RWC: The 2nd International Workshop on Vision Intelligence for Real-world Challenges

Daqian Shi ⋅ Xiaolei Diao

8:30 AM - 12:30 PM

Workshop

Computational Cameras and Displays

Vishwanath Saragadam ⋅ Fei Xia

8:45 AM - 5:00 PM

Workshop

Third Joint Egocentric Vision (EgoVis) Workshop

Siddhant Bansal ⋅ Tushar Nagarajan

8:45 AM - 5:50 PM

Workshop

AERO-HPR: Human Perception and Recognition in Aerial Surveillance

Kien Nguyen Thanh ⋅ Arun Ross

8:50 AM - 12:30 PM

Workshop

2nd Workshop on Photorealistic 3D Head Avatars

Tobias Kirschstein ⋅ Simon Giebenhain

8:50 AM - 12:30 PM

Workshop

Efficient Deep Learning for Computer Vision

Shuai Zhang ⋅ Yung-Hsiang Lu

8:50 AM - 3:30 PM

Tutorial

Accelerated Diffusion Models: From Theory to Interactive World Models

Julius Berner · Weili Nie · Arash Vahdat

9:00 AM - 12:15 PM

Diffusion models have become a cornerstone of modern generative modeling, but their practical deployment in interactive applications is often limited by slow and computationally expensive sampling. This tutorial focuses on recent advances in accelerating diffusion models, providing a comprehensive overview of methods that enable fast and efficient generation. It covers general acceleration techniques, training-based approaches such as distillation into few-step samplers, and practical strategies for scaling to image and video generation. The tutorial further highlights how these advances enable emerging applications such as interactive world models and real-time generative systems, and provides hands-on guidance through the FastGen library.

Workshop

The 3rd Workshop on AI for Content Generation, Quality Enhancement and Streaming

Marcos V. Conde ⋅ Radu Timofte

9:00 AM - 1:00 PM

Workshop

The 22nd Embedded Vision Workshop

Matteo Poggi ⋅ Tse-Wei Chen

9:00 AM - 12:30 PM

Workshop

The 3rd Workshop on Foundation Models for Medical Vision

Jun Ma ⋅ Yuyin Zhou

9:00 AM - 12:30 PM

Workshop

12th Workshop on Medical Computer Vision

Zongwei Zhou ⋅ Yucheng Tang

9:00 AM - 4:00 PM

Workshop

Urban Scene Modeling: Structured, Semantic, and Synthetic 3D Habitats

Jack Langerman ⋅ Ruisheng Wang

9:00 AM - 6:00 PM

Workshop

Workshop on Autonomous Driving

Vincent Casser ⋅ Jose M. Alvarez

9:15 AM - 6:00 PM

Tutorial

Building GenAI based Simulation Environment for End-to-End Autonomous Driving

Henry Liu · Howie Sun · Jun Gao · Shuo Feng · Xintao Yan · Jiawei Wang

1:00 PM - 5:00 PM

Generative AI is transforming simulation for autonomous driving, enabling data-driven and closed-loop environments that better capture the complexity of real-world scenarios. This tutorial presents an end-to-end framework for building generative simulation pipelines tailored to modern learning-based driving systems. It covers key components including world modeling and city-scale digital twins, generative synthesis of rare and safety-critical scenarios, and realistic sensor and video simulation using both graphics and neural approaches. The tutorial further discusses system-level evaluation and integration with autonomous driving stacks, providing practical guidance and open-source tools for developing scalable and reliable simulation environments.

Tutorial

From Perception to Simulation: The Emergence of World Models in Multi-modal Reasoning

Yujun Cai · Jianfei Cai · Yiwei Wang · Ming-Hsuan Yang

1:00 PM - 5:00 PM

World models are emerging as a new paradigm in computer vision and multimodal learning, enabling systems to move beyond perception toward reasoning, simulation, and decision-making. This tutorial explores how world models have evolved from predictive frameworks into engines for multi-modal reasoning, capable of simulating environments, supporting counterfactual thinking, and enabling planning. It examines key approaches for learning world dynamics from visual data, including both discrete tokenization and diffusion-based methods, and highlights their role in modeling physical and causal structure. The tutorial further covers how these models support reasoning through simulation, as well as their applications in embodied agents and robotics, while discussing key challenges such as grounding, scalability, and causal understanding.

Tutorial

Monte Carlo physical simulation

Rohan Sawhney · Bailey Miller · Ioannis Gkioulekas · Keenan Crane

1:00 PM - 5:00 PM

Partial differential equations (PDEs) play a central role in physics-based modeling across vision, graphics, and robotics, but conventional grid-based solvers often struggle with scalability and complex geometry. This tutorial introduces grid-free Monte Carlo methods for solving PDEs, focusing on algorithms such as walk on spheres and walk on stars that eliminate the need for spatial discretization. It presents the theoretical foundations of these methods alongside practical techniques for efficient sampling, variance reduction, and differentiable simulation. The tutorial also highlights applications in vision and robotics, including inverse problems and physics-based learning, and provides hands-on guidance for implementing Monte Carlo PDE solvers in real-world systems.

Tutorial

Principled Interpretability in Vision Models: From Mechanistic Understanding to Interpretable Models by Design

Tsui-Wei (Lily) Weng · Tuomas Oikarinen

1:00 PM - 5:00 PM

As deep learning systems are increasingly deployed in high-stakes applications, understanding their internal behavior is essential for ensuring trust, safety, and reliability. However, the field of interpretability remains fragmented, spanning diverse methods without a unified framework or standardized evaluation. This tutorial aims to provide a comprehensive overview of interpretability in vision models, bridging post-hoc mechanistic analysis with approaches that design inherently interpretable models. It reviews techniques for analyzing neural networks at multiple levels—from individual neurons to circuits—alongside recent advances in evaluating the faithfulness of explanations. In addition, the tutorial covers emerging methods for learning interpretable models by design, such as concept-based approaches, and highlights practical applications in debugging, model editing, and safety auditing.

Workshop

GigaBrain Challenge 2026: Workshop on World Models Empowering Vision Language Action Model

Zheng Zhu ⋅ Xiaofeng Wang ⋅ Hongyang Li ⋅ Shanghang Zhang ⋅ Yao Mu ⋅ Haoqiang Fan ⋅ Zhizhong Su

1:00 PM - 6:00 PM

The GigaBrain Challenge 2026 aims to push the boundaries of embodied intelligence. The workshop features four competition tracks focused on simulation, world models, real-world robot evaluation, and creative agentic demos for embodied AI, along with a Best Paper session showcasing outstanding research contributions. Beyond the formal program, participants will also have the opportunity to hear firsthand know-how from organizers, competition winners, and researchers on key topics such as world models, VLA, embodied foundation model evaluation, simulation, and agentic systems for embodied intelligence.

Workshop

The Second CVPR Workshop on Foundation and Large Vision Models in Remote Sensing (MORSE)

Saurabh Prasad ⋅ Jocelyn Chanussot

1:00 PM - 5:45 PM

Workshop

The 2nd 3D-LLM/VLA Workshop: Bridging Language, Vision and Action in 3D Environments

Yining Hong ⋅ Wenbo Hu

1:00 PM - 6:15 PM

Workshop

10th Affective & Behavior Analysis in-the-wild

Dimitrios Kollias ⋅ Panagiotis Tzirakis

1:00 PM - 6:00 PM

Workshop

Authenticity & Provenance in the age of Generative AI

Shruti Agarwal ⋅ Sarah Barrington

1:00 PM - 6:00 PM

Workshop

Auto-Annotation with Expert-Crafted Guidelines

Shu Kong ⋅ Sara Beery

1:00 PM - 5:00 PM

Machine-learned visual systems are transforming numerous fields such as autonomous driving, biodiversity assessment, and ecological monitoring, but they require vast amounts of high-quality annotated data. Asking domain experts to manually annotate large-scale data is unrealistic; the current paradigm for scaling up data annotation is to have domain experts craft annotation guidelines, using visual examples and descriptions, for non-expert annotators to apply. This paradigm is commonly adopted by companies that provide data labeling services. Lacking domain knowledge, ordinary annotators often produce annotations that are erroneous, subjective, biased, and inconsistent. Further, this process is labor-intensive, tedious, and costly. This workshop aims to pioneer auto-annotation, developing AI agents that can interpret expert-crafted annotation guidelines and generate labels automatically. In essence, we seek to replace ordinary human annotators with AI.

Workshop

Cognitive Foundations for Multimodal Models

Aditya Chinchure ⋅ Sahithya Ravi

1:00 PM - 5:00 PM

Workshop

Computer Vision for the Built World

Iro Armeni ⋅ Fuxin Li

1:00 PM - 6:00 PM

Workshop

Computer Vision with Small Data: Beyond Scale -- Toward Data-Efficient Dynamically-Aware Video Intelligence

Sarah Ostadabbas ⋅ Shayda Moezzi

1:00 PM - 5:00 PM

Workshop

Computer Vision for Biomechanics Workshop

Ethan Goan ⋅ Akila Pemasiri

1:00 PM - 6:00 PM

Workshop

Sixth Workshop on Neural Architecture Search

Stephen McGough ⋅ Amir Atapour-Abarghouei

1:00 PM - 5:00 PM

Workshop

DataMFM: Emerging Directions in Data for Multimodal Foundation Models

Pengyuan Li ⋅ Zihan Wang

1:00 PM - 6:05 PM

Workshop

End-to-End 3D Learning

Zhiwen Fan ⋅ Dimitris Metaxas

1:00 PM - 6:00 PM

Workshop

3rd Workshop on Efficient and On-Device Generation (EDGE), CVPR 2026

Felix Juefei-Xu ⋅ Tingbo Hou

1:00 PM - 6:00 PM

Workshop

1st Workshop on Multi-Agent Robotic Systems: Scaling with Compositional Intelligence

Yiran Qin ⋅ Zhenfei Yin

1:00 PM - 5:00 PM

Workshop

The 5th Workshop on “What is Next in Multimodal Foundation Models?”

Edson Araujo ⋅ Roei Herzig

1:00 PM - 6:00 PM

Workshop

Workshop on Multimodal Human Motion Analysis

Olivia Nocentini ⋅ Rishabh Dabral

1:00 PM - 6:00 PM

Workshop

The 1st Workshop on Monitoring the World through an Imperfect Lens

Miriam Cha ⋅ Greg Angelides

1:00 PM - 5:00 PM

Workshop

2nd Workshop on Multimodal Sign Language Recognition

Raffaele Mineo ⋅ Hamzah Luqman

1:00 PM - 5:30 PM

MSLR 2026 is the second edition of a rapidly growing venue on multimodal sign language recognition and translation. The program combines invited talks, a peer-reviewed track published in CVPR Workshops, and the SignEval Challenge featuring updated datasets for isolated LIS and continuous SLR. We emphasize privacy-preserving sensing (e.g., radar), healthcare accessibility, and inclusive practices with sign interpreters. Building on the success at ICCV 2025, MSLR 2026 will consolidate a global, interdisciplinary community spanning computer vision, linguistics, healthcare, and Deaf studies.

Workshop

The 3rd MetaFood Workshop (MTF)

Yuhao Chen ⋅ Petia Radeva

1:00 PM - 6:00 PM

Workshop

Machine Unlearning for Vision

Alessio Sampieri ⋅ Bardh Prenkaj

1:00 PM - 6:00 PM

Workshop

OpenSUN3D: 6th Workshop on Open-World 3D Scene Understanding with Foundation Models

Francis Engelmann ⋅ Anna-Maria Halacheva

1:00 PM - 5:00 PM

Workshop

Synthetic & Adversarial ForEnsics

Josué Martínez-Martínez ⋅ Pooya Khorrami

1:00 PM - 6:00 PM

Workshop

3rd Workshop on ScanNet++ Novel View Synthesis and 3D Semantic Understanding Challenge

Angela Dai ⋅ Matthias Nießner ⋅ Chandan Yeshwanth

1:00 PM - 5:30 PM

Workshop

The 7th International Workshop and CVML Challenge on Agriculture-Vision: Challenges & Opportunities for Computer Vision in Agriculture

Chris Padwick ⋅ Ripudaman Arora

1:00 PM - 6:00 PM

Workshop

The 1st Workshop on Vision for Intelligent Task Assistants

Ehsan Elhamifar ⋅ Jason J. Corso

1:00 PM - 5:00 PM

Workshop

Second Workshop on Foundation and Generative Models in Biometrics

Hatef Otroshi Shahreza ⋅ Vitomir Struc

1:15 PM - 6:00 PM

Workshop

Rediscovering Intelligence: Can AI Still Learn from Humans?

Xi Wang ⋅ Yen-Ling Kuo

1:20 PM - 5:30 PM

Workshop

The 2nd Workshop on Test-time Scaling for Computer Vision

Yinpeng Dong ⋅ Yichi Zhang

1:25 PM - 5:30 PM

Tutorial

3D Human Mesh Modeling and Recovery from RGB and LiDAR

Romain Bregier · Istvan Sarandi · Salma Galaaoui · Fabien Baradel · Nermin Samet · David Picard

1:30 PM - 4:45 PM

Understanding human pose and shape through parametric body models is a key enabler of applications from AR/VR and sports analysis to human-robot interaction. This tutorial provides an in-depth overview of parametric body models and their role in Human Mesh Recovery. We cover fundamental principles and recent developments, guiding practitioners through major models (e.g., SMPL, Anny, MHR, SOMA) and their trade-offs. We then present state-of-the-art Human Mesh Recovery methods, with a focus on challenging in-the-wild settings across different input modalities, including single- and multi-view RGB, video, depth and LiDAR.

Workshop

Spatial Intelligence for Cultural Heritage

Marina Paolanti ⋅ Roberto Pierdicca ⋅ Jing Zhang ⋅ Emanuele Balloni

1:30 PM - 5:45 PM

Workshop

The 5th Workshop on Transformers for Vision and Multimodal AI

Gedas Bertasius ⋅ Zhiding Yu

1:45 PM - 5:40 PM

Workshop

The 1st Workshop on AI-assisted Long Video Creation

Yudong Jiang ⋅ Lisai Zhang

2:00 PM - 5:00 PM