CVPR 2026 Thursday 06/4

Registration

Registration / Badge Pickup

7:00 AM - 5:00 PM

Workshop

3D Geometry Generation for Scientific Computing (2nd Edition)

Wuyang Chen ⋅ Marissa Ramirez de Chanlatte ⋅ Peter Yichen Chen ⋅ Chuhang Zou ⋅ Zhiwen Fan ⋅ Daniel Martin ⋅ Michael Mahoney

7:30 AM - 12:30 PM

Workshop

2nd Workshop on Knowledge-Intensive Multimodal Reasoning

Arman Cohan ⋅ Yilun Zhao

7:30 AM - 12:30 PM

Tutorial

Recent Advances in AI for Medical Imaging: Progress, Challenges, and Future Directions

Jiaqi Wang · Peirong Liu · Can Zhao

8:00 AM - 12:00 PM

Artificial intelligence has driven significant advances in medical imaging, improving tasks such as image reconstruction, diagnosis, and clinical decision support across modalities including MRI, CT, X-ray, and pathology. This tutorial provides an up-to-date overview of key paradigms in the field, including physics-informed learning, medical foundation models, and collaborative approaches such as federated and multi-agent systems. It examines major challenges such as generalization, interpretability, data heterogeneity, and privacy constraints, while highlighting emerging solutions and open research directions. The tutorial aims to offer a comprehensive perspective on the development and deployment of reliable, clinically relevant AI systems for medical imaging.

Tutorial

Computer Vision at Scale: Multi-Camera Tracking, Calibration, and Event Detection for Checkout-Free Retail

Hareesh Kolluru · Motilal Agarwal · Tanmay Bangalore

8:00 AM - 12:00 PM

Large-scale multi-camera systems are central to real-world computer vision applications, yet their design is shaped as much by infrastructure constraints as by algorithmic advances. This tutorial presents a unified perspective on multi-camera vision through the lens of checkout-free retail, focusing on three core components: automatic camera calibration, real-time multi-object tracking, and structured event detection. It examines how challenges such as asynchrony, partial observability, hardware failures, and edge deployment constraints influence system design and performance. The tutorial further highlights generalizable principles for building reliable, scalable vision systems, bridging the gap between academic methods and real-world deployment.

Tutorial

Extending Computer Vision to Hidden Objects: A Tutorial on Millimeter-Wave Imaging and Reconstruction of Occluded Scenes

Mingmin Zhao · Laura Dodds

8:00 AM - 12:00 PM

Millimeter-wave (mmWave) sensing is emerging as a new modality for computer vision, enabling perception of objects and scenes that are occluded or invisible to traditional cameras. This tutorial introduces the fundamentals of mmWave imaging, highlighting how its physical properties enable through-occlusion sensing and all-weather perception. It covers both classical signal-processing approaches and recent learning-based methods for 3D reconstruction, segmentation, and scene understanding. The tutorial further provides practical guidance on datasets, tools, and open challenges, offering a comprehensive entry point for researchers interested in extending vision systems beyond visible light.

Tutorial

The Full Stack of Physical AI: Simulation, Foundation Models, and Edge Deployment for Next-Generation Robotics Applications

Raymond Lo · Johnny Nunez · Chitoku Yato · Spencer Huang · Mitesh Patel

8:00 AM - 12:00 PM

Physical AI systems, including robotics and autonomous platforms, require tightly integrated pipelines spanning data collection, model training, and real-time deployment. This tutorial presents a full-stack perspective on building such systems, covering simulation-based data generation, foundation models for robot control, and deployment on edge hardware. It introduces practical workflows using modern tools for human-in-the-loop data collection, multimodal robot foundation models, and hardware-aware optimization for low-latency inference. The tutorial further highlights challenges in scaling and deploying physical AI systems, providing attendees with actionable guidance and open-source resources for end-to-end robotics development.

Workshop

Third Workshop for Learning 3D with Multi-View Supervision

Abdullah J Hamdi ⋅ Silvio Giancola

8:00 AM - 12:30 PM

Workshop

6th Workshop on 3D Scene Understanding for Vision, Graphics, and Robotics

Yixin Chen ⋅ Shaofei Wang

8:00 AM - 12:00 PM

Workshop

Workshop on Any-to-any Multimodal Learning

Shengqiong Wu ⋅ Wei Dai

8:00 AM - 12:00 PM

Workshop

The 3rd Workshop on New Trends in AI-Generated Media and Security

Shu Hu ⋅ Xin Wang

8:00 AM - 12:30 PM

Workshop

2nd Workshop on Computer Vision for Children

Yifan Shen ⋅ Xu Cao

8:00 AM - 12:30 PM

Workshop

The 5th Workshop on Computer Vision in the Wild: Towards Unified Multimodal Agents For Reasoning in the Wild

Reuben Tan ⋅ Zhengyuan Yang

8:00 AM - 12:00 PM

Workshop

The Second Workshop on the Evaluation of the Generative Foundation Models

Wisdom Ikezogwo ⋅ Maria Zontak

8:00 AM - 12:00 PM

Workshop

Geometry-Free Novel View Synthesis and Controllable Video Models

Andrea Tagliasacchi

8:00 AM - 12:30 PM

Workshop

Humans of Generative AI

Jaron Mink ⋅ David Forsyth

8:00 AM - 12:00 PM

Workshop

The 1st Workshop on Low‑Level Vision Frontiers with Generative AI, Preference Optimization, and Agentic Systems

Xin Li ⋅ Yeying Jin

8:00 AM - 12:30 PM

Workshop

6th Omnidirectional Computer Vision Workshop

Pierre Moulon ⋅ Guillaume Caron

8:00 AM - 12:10 PM

Workshop

Open-World Vision

Shu Kong ⋅ Neehar Peri

8:00 AM - 12:00 PM

Open-World Vision (OWV) emphasizes realistic opportunities and challenges in developing and deploying computer vision systems in the dynamic, vast, and unpredictable real open world, which offers abundant data that can both benefit training and challenge testing. It contrasts with the traditional "closed-world" paradigm of visual learning and inference, which assumes fixed, known data distributions and categorical labels. Models developed under such closed-world assumptions tend to be brittle when encountering ever-changing and novel scenarios in the real open world. Modern visual learning has shifted towards an open-world paradigm, such as pretraining foundation models on massive data sourced from the open world (e.g., web-sourced data). While these models show unprecedented performance and strong adaptability to downstream tasks, they inherit biases from their open-world pretraining data and can still fail in truly novel or underrepresented scenarios during deployment. This workshop aims not only to uncover current limitations, potential risks, emerging opportunities, and unresolved challenges of open-world vision, but also to solicit solutions that advance the field toward more robust, fair, and adaptable visual systems.

Workshop

From Perception to Persuasion: Challenges and Advances in Misinformation Detection in Society

Priyanka Singh ⋅ Xue Li

8:00 AM - 12:20 PM

Workshop

SPAR-3D: Security, Privacy, and Adversarial Robustness in 3D Generative Vision Models

Nicole Meng ⋅ Yingjie Lao

8:00 AM - 12:00 PM

Workshop

Trustworthy, Robust, Uncertainty-Aware, and Explainable Visual Intelligence and Beyond

Tsui-Wei Weng ⋅ Nghia Hoang

8:00 AM - 12:30 PM

Workshop

The 8th UG2+ Workshop and Challenge: Bridging the Gap between Computational Photography and Visual Perception

Alex Wong ⋅ Dong Lao

8:00 AM - 12:00 PM

Workshop

Unified Robotic Vision with Cross-Modal Sensing and Alignment

Zongwei Wu ⋅ Christos Sakaridis

8:00 AM - 12:30 PM

Workshop

9th International Workshop on Visual Odometry and Computer Vision Applications Based on Location Clues

Guoyu Lu ⋅ Friedrich Fraundorfer

8:00 AM - 12:00 PM

Workshop

11th Workshop on Computer Vision and Multimodal Microscopy Image Analysis

Steve Finkbeiner ⋅ Mei Chen

8:00 AM - 5:00 PM

Workshop

The Seventh Annual Embodied Artificial Intelligence Workshop

Anthony Francis ⋅ David Hall

8:00 AM - 5:00 PM

Workshop

2nd Workshop on Agents in Interaction, from Humans to Robots

Yufei Ye ⋅ Homanga Bharadhwaj

8:00 AM - 5:00 PM

Workshop

Mobile AI workshop and associated challenges, 6th edition

Andrey Ignatov ⋅ Radu Timofte

8:00 AM - 5:00 PM

Workshop

Multi-Agent Embodied Intelligent Systems Meet Agentic-AI era: Opportunities, Challenges and Futures

Xiangbo Gao ⋅ Yuheng Wu

8:00 AM - 5:00 PM

Workshop

11th New Trends in Image Restoration and Enhancement Workshop and Challenges

Radu Timofte ⋅ Zongwei Wu

8:00 AM - 6:00 PM

Workshop

Video Generative Models: Benchmarks and Evaluation

Shuo Xing ⋅ Mingyang Wu

8:00 AM - 5:00 PM

Workshop

2nd Workshop on Video Large Language Models

Rohit Gupta ⋅ Sirnam Swetha

8:00 AM - 5:00 PM

Workshop

Workshop on Visual Concepts

Joy Hsu ⋅ R. Kenny Jones

8:00 AM - 5:00 PM

Workshop

Sight and Sound

Andrew Owens ⋅ Jiajun Wu

8:00 AM - 5:00 PM

Workshop

4th Workshop on Maritime Computer Vision

Benjamin Kiefer ⋅ Jon Muhovic

8:10 AM - 12:30 PM

Tutorial

Analytic understanding of diffusion models

Artem Lukoianov · Chenyang Yuan · Christopher Scarvelis · Mason Kamb

8:30 AM - 5:00 PM

Diffusion models achieve state-of-the-art performance in generative modeling, yet their theoretical foundations and generalization behavior remain poorly understood. This tutorial focuses on the analytical understanding of diffusion models, addressing the apparent paradox between closed-form optimal denoisers and the empirical success of deep diffusion networks. It introduces recent theoretical advances that explain how mechanisms such as score smoothing, training dynamics, neural network inductive biases, and data structure contribute to generalization. By combining mathematical insights with hands-on experiments, the tutorial provides a principled framework for understanding the inner workings of diffusion models and for interpreting recent developments in the field.

Workshop

6th Workshop on CV4Animals: Computer Vision for Animal Behavior Tracking and Modeling

Mitchell Rogers ⋅ Tuan-Anh Vu ⋅ Xiaoxuan Ma ⋅ Isla Duporge ⋅ Shangzhe Wu

8:30 AM - 12:30 PM

Workshop

Exploring the Next Generation of Data

Nadine Chang ⋅ Maying Shen

8:30 AM - 12:00 PM

Workshop

Personalization in Generative AI Workshop

Pinar Yanardag ⋅ Nupur Kumari

8:30 AM - 12:30 PM

Workshop

PhysHuman: Physically Grounded Human Perception and Modeling

Feng Liu ⋅ Youngjoong Kwon ⋅ Cheng Zhang

8:30 AM - 12:30 PM

Workshop

Safe Artificial Intelligence for All Domains

Oliver Wasenmüller ⋅ Markus Enzweiler

8:30 AM - 12:30 PM

Workshop

VizWiz Grand Challenge: Interpreting Images and Videos Taken by Blind People

Danna Gurari ⋅ Neelima Prasad

8:45 AM - 12:20 PM

Workshop

4th Workshop on Generative Models for Computer Vision

Adam Kortylewski ⋅ Fangneng Zhan

8:45 AM - 5:00 PM

Workshop

9th Multimodal Learning and Applications Workshop

Paolo Rota ⋅ Michael Ying Yang

8:45 AM - 6:30 PM

Workshop

Multimodal Algorithmic Reasoning Workshop

Anoop Cherian ⋅ Suhas Lohit

8:55 AM - 12:30 PM

Tutorial

All You Need To Know About Self-Driving

Raquel Urtasun · Abbas Sadat · Sivabalan Manivasagam · Jingkang Wang · Ioan Andrei Barsan

9:00 AM - 5:30 PM

Autonomous driving has evolved into a complex, full-stack problem that integrates perception, prediction, planning, simulation, and safety within a unified system. This tutorial provides a comprehensive overview of modern self-driving pipelines, covering both traditional modular approaches and emerging end-to-end paradigms. It reviews key components including sensor systems, multi-modal perception, motion forecasting, planning and control, large-scale simulation, and data-centric development. The tutorial further highlights recent advances such as foundation models, generative simulation, and real-world deployment at scale, offering a unified perspective on current challenges and future directions in self-driving systems.

Workshop

The Eighth Workshop on Precognition: Seeing through the Future

Khoa Luu ⋅ Nemanja Djuric

9:00 AM - 12:15 PM

Workshop

The 6th Workshop of Adversarial Machine Learning on Computer Vision: Safety of Vision-Language Agents

Aishan Liu ⋅ Jiakai Wang ⋅ Jin Hu ⋅ Tianyuan Zhang

9:00 AM - 5:00 PM

Workshop

12th IEEE International Workshop on Computer Vision in Sports

Rikke Gade ⋅ Silvio Giancola

9:00 AM - 5:30 PM

Workshop

EarthVision: Large Scale Computer Vision for Remote Sensing Imagery

Ronny Haensch ⋅ Devis Tuia

9:00 AM - 5:00 PM

Workshop

Embodied Reasoning in Action: Workshop and Challenge on Embodied Reasoning for Robotic Manipulation

Jiafei Duan ⋅ Jason Ren

9:00 AM - 5:00 PM

Workshop

2nd Workshop on Human-Interactive Generation and Editing

Jinbo Xing ⋅ Xi Chen

9:00 AM - 5:30 PM

Workshop

How Do Vision Models Work?

Tamar Rott Shaham ⋅ Amil Dravid

9:00 AM - 5:00 PM

Tutorial

Foundations and Frontiers of Watermarking: Algorithms, Multimodal Extensions, Benchmarks, and Authenticity Frameworks

Vishal Asnani · Shruti Agarwal · Benedetta Tondi · Pierre Fernandez · Furong Huang

1:00 PM - 5:00 PM

Watermarking has re-emerged as a critical component of trustworthy AI, driven by the rapid growth of generative models and the need for content attribution and authenticity. This tutorial provides a unified overview of watermarking, spanning classical signal-processing foundations and modern deep-learning–based approaches across images, video, audio, and multimodal data. It examines key challenges such as robustness, capacity, and adversarial resilience, along with recent benchmarking efforts and evaluation frameworks. The tutorial further connects these methods to real-world deployment through applications in content provenance, media forensics, and emerging standards such as C2PA, offering a comprehensive perspective on building reliable and transparent media systems.

Tutorial

From Perception to Action: Building Efficient and Deployable Robot Intelligence Pipelines with Open-Source Edge AI Toolkits

Samet Akcay · Zhuo Wu · Michael Paulitsch · Ashutosh Kumar · Tao Xiong · Adrian Boguszewski · Sameer Sheorey · Benjamin Ummenhofer

1:00 PM - 5:00 PM

Robotic manipulation has become a key application of embodied AI, but many research pipelines remain difficult to reproduce and deploy in real-world systems. This tutorial presents an end-to-end, open-source workflow for building efficient robot intelligence pipelines, covering data collection, visuomotor policy training, simulation, and deployment on edge hardware. It introduces practical techniques such as teleoperated data acquisition, diffusion- and transformer-based policies, and neural object cloning for simulation-ready assets. The tutorial further emphasizes model optimization and real-time deployment, culminating in a live demonstration of a complete perception-to-action pipeline on an affordable robotic platform.

Tutorial

The Road to Convergence: Evolution of Unified Multimodal Models

Jindong Wang · Hao Chen · Jiakui Hu · Zhaolong Su · Sharon Li

1:00 PM - 5:00 PM

Unified multimodal models are emerging as a new paradigm that integrates understanding and generation across modalities within a single foundation model. This tutorial provides a comprehensive overview of these models, addressing the currently fragmented landscape of architectures, representations, and training strategies. It introduces a unified perspective on key design choices, including modeling paradigms, multimodal tokenization, and alignment methods, while reviewing benchmarks and real-world applications. The tutorial further highlights open challenges such as scalable representation learning and unified world modeling, offering a structured roadmap for future research in multimodal AI.

Workshop

1st Workshop on Generative 3D Reconstruction

Daniel Barath ⋅ Fabian Manhardt

1:00 PM - 6:00 PM

Workshop

Medical Reasoning with Vision Language Foundation Models

Anas Zafar ⋅ Muhammad Waqas ⋅ Alejandro Lozano

1:00 PM - 6:00 PM

Workshop

4D Digital Twins: Real-to-Sim-to-Real for Physical AI

Amrita Mazumdar ⋅ Tianye Li

1:00 PM - 6:00 PM

Workshop

2nd Workshop on 4D Vision: Modeling the Dynamic World

Jiahui Lei ⋅ Shangzhe Wu

1:00 PM - 6:00 PM

Workshop

Artificial Intelligence for Space

Daniele Gammelli ⋅ Gabriele Meoni

1:00 PM - 5:00 PM

Workshop

2nd Workshop on GenAI for Storytelling

Andrew Shin ⋅ Yusuke Mori

1:00 PM - 5:00 PM

Workshop

Big Model Adaptation In Computer Vision

Yuki Asano ⋅ Anna Kukleva

1:00 PM - 5:00 PM

Workshop

CVPR 2026 Biometrics Workshop

Bir Bhanu ⋅ Ajay Kumar

1:00 PM - 5:00 PM

Workshop

Bridging AI and Medical Reality: Computer Vision for Real-world Clinical Translation

Yicheng Wu ⋅ Yutong Xie ⋅ Kai Wang

1:00 PM - 6:00 PM

Workshop

Computer Vision × Education: Building a Cross‑Community Agenda for Multimodal Vision in Classrooms

Ekta Sood ⋅ Joyces H Fonteles

1:00 PM - 6:00 PM

Workshop

CV4Science: Using Computer Vision for the Sciences

Utkarsh Mall ⋅ Ye Zhu

1:00 PM - 5:45 PM

Workshop

Domain Generalization: Evolution, Breakthroughs, and Future Horizons (2nd Edition)

Muhammad Haris Khan ⋅ Rishabh Lalla

1:00 PM - 6:00 PM

Workshop

The 2nd CVPR Workshop on Foundation Models Meet Embodied Agents

Manling Li ⋅ Qineng Wang

1:00 PM - 6:00 PM

Workshop

The 7th International Workshop on Eye and Gaze in Computer Vision

Yihua Cheng ⋅ Seonwook Park ⋅ Hyung Jin Chang

1:00 PM - 5:00 PM

Workshop

Eighth Workshop on Image Matching: Local Features and Beyond

Dmytro Mishkin ⋅ Eduard Trulls

1:00 PM - 6:00 PM

Workshop

1st Workshop on Journey to the Awards: Generative AI for Movie-Grade Video Production (J2A), CVPR 2026

Felix Juefei-Xu ⋅ Stephane Grabl

1:00 PM - 6:00 PM

Workshop

The 2nd Workshop on Multi-Modal Reasoning for Agentic Intelligence

Yijiang Li ⋅ Zhenfei Yin

1:00 PM - 5:00 PM

Workshop

4D World Models: Bridging Generation and Reconstruction

Aayush Prakash ⋅ Aashish Rai

1:00 PM - 6:00 PM

Workshop

Third Workshop on Simulation for Autonomous Driving

Yiyi Liao ⋅ Maximilian Igl

1:00 PM - 5:00 PM

Workshop

ScaleBot: The First Workshop on Scalable Robot Learning Systems

Sijin Chen ⋅ Yuxiang Lu

1:00 PM - 5:00 PM

Workshop

The 3rd Workshop on Synthetic Data for Computer Vision

Jieyu Zhang ⋅ Zixian Ma

1:00 PM - 5:30 PM

Workshop

Second Workshop on Skilled Activity Understanding, Assessment & Feedback Generation

Paritosh Parmar ⋅ Brendan Morris

1:15 PM - 6:00 PM

Imagine a world where computer vision-based systems can analyze a video of an athlete, a surgeon, a patient, or a factory worker and instantly provide expert-level actionable feedback: correcting techniques, identifying inefficiencies, and helping people refine their skills in real time. Thanks to rapid progress in video understanding, this vision is becoming reality. AI-powered systems can now analyze complex human activities, assess performance, and generate intelligent feedback, unlocking new possibilities in sports, healthcare, manufacturing, education, rehabilitation, and beyond. Through expert keynotes and invited contributions, this CVPR 2026 workshop will explore the cutting edge of skilled activity understanding, assessment, and feedback generation, bridging research and real-world applications.

As AI systems become more capable of understanding human expertise, the implications are profound: empowering individuals with personalized coaching, democratized skill development, and scalable training solutions. We invite researchers, industry leaders, and practitioners to join us in shaping the future of AI-powered skill understanding. Whether working on foundational research, applied solutions, or real-world deployment, this workshop is an opportunity and forum to learn about and push the boundaries of how AI perceives, evaluates, and enhances human ability.

Workshop