

Registration Desk
7:00 AM - 5:00 PM
Workshop

HYBRID: Room Arch 204, Seattle Convention Center
SCHEDULE: https://www.cvlai.net/ntire/2024/#schedule
Mon Jun 08:00 - 18:00 PDT

Lunch Break: 12:00-13:00
Poster session: 16:00-18:00

Workshop

Efficient Large Vision Models

Auke Wiggers · Amirhossein Habibian
8:00 AM - 12:35 PM
Workshop

2nd Workshop on Foundation Models

Hisham Cholakkal · Teng Xi
8:30 AM - 5:30 PM
Workshop
8:30 AM - 5:30 PM

In the past decade, deep learning has been advanced mainly by training ever-larger models on ever-larger datasets, at the price of massive computation and expensive hardware for training. As a result, research on designing state-of-the-art models is gradually being monopolized by large companies, while research groups with limited resources, such as universities and small companies, are unable to compete.
Reducing the training dataset size while preserving training effectiveness is therefore significant for reducing training costs, enabling green AI, and encouraging university research groups to engage in the latest research.
This workshop focuses on the emerging research field of dataset distillation (DD), which aims to compress a large training dataset into a tiny, informative one (e.g., 1% of the size of the original data) while maintaining the performance of models trained on it. Besides general-purpose efficient model training, dataset distillation can also greatly facilitate downstream tasks: neural architecture/hyperparameter search, by speeding up model evaluation; continual learning, by producing compact memory; federated learning, by reducing data transmission; and privacy-preserving learning, by avoiding the release of raw private data. Dataset distillation is also closely related to research topics including core-set selection, prototype generation, active learning, few-shot learning, generative models, and the broad area of learning from synthetic data.




Although DD has become an important paradigm for various machine learning tasks, its potential in computer vision (CV) applications, such as face recognition, person re-identification, and action recognition, is far from fully exploited.
Moreover, DD has rarely been demonstrated effectively on advanced computer vision tasks such as object detection, image segmentation, and video understanding.
Further, numerous unexplored challenges and unresolved issues remain in the realm of DD.
One such challenge is finding efficient ways to modify existing DD workflows, or to create entirely new ones, that address a wide range of computer vision tasks beyond mere image classification.
Another lies in improving the scalability of DD methods so they can compress real-world datasets beyond the scale of ImageNet.

The purpose of this workshop is to unite researchers and professionals who share an interest in dataset distillation for computer vision, and to develop the next generation of DD methods for computer vision applications.
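To make the goal concrete, here is a deliberately tiny sketch of the prototype-generation flavor of distillation mentioned above: each class of a toy dataset is compressed to a single prototype, and a nearest-prototype classifier is trained on the distilled set. The dataset, names, and numbers are illustrative assumptions; real DD methods instead *learn* synthetic samples, e.g. by gradient matching.

```python
# Illustrative sketch only: compress each class of a toy 2-D dataset to a
# single prototype (the class mean) and classify by nearest prototype.
from collections import defaultdict
import math

def distill_to_prototypes(samples):
    """samples: list of (feature_vector, label) -> {label: prototype}."""
    sums, counts = defaultdict(lambda: None), defaultdict(int)
    for x, y in samples:
        sums[y] = list(x) if sums[y] is None else [a + b for a, b in zip(sums[y], x)]
        counts[y] += 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def classify(prototypes, x):
    """Nearest-prototype classifier trained on the distilled set."""
    return min(prototypes, key=lambda y: math.dist(prototypes[y], x))

# Toy dataset: two well-separated 2-D clusters.
data = [([0.0, 0.1], "a"), ([0.2, -0.1], "a"),
        ([5.0, 5.1], "b"), ([4.8, 4.9], "b")]
protos = distill_to_prototypes(data)   # 4 samples -> 2 prototypes
print(classify(protos, [0.1, 0.0]))    # lands in cluster "a"
```

The distilled set here is 50% of the original only because the toy dataset is tiny; the same mechanism scales to the 1% regime the description targets.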

Workshop

AI for 3D Generation

Despoina Paschalidou
8:30 AM - 5:30 PM
Workshop

AI for Content Creation (AI4CC)

James Tompkin · Deqing Sun
8:30 AM - 5:30 PM
Tutorial

Machine Unlearning in Computer Vision: Foundations and Applications

Sijia Liu · Yang Liu · Nathalie Baracaldo · Eleni Triantafillou
9:00 AM - 12:00 PM

This tutorial aims to offer a comprehensive understanding of emerging machine unlearning (MU) techniques. These techniques are designed to accurately assess the impact of specific data points, classes, or concepts (e.g., related to copyrighted information, biases and stereotypes, and personally identifying data) on model performance and efficiently eliminate their potentially harmful influence within a pre-trained model. With the recent shift to foundation models, MU has become indispensable, as re-training from scratch is prohibitively costly in terms of time, computational resources, and finances. Despite increasing research interest, MU for vision tasks remains significantly underexplored compared to its prominence in the security and privacy (SP) field. Within this tutorial, we will delve into the algorithmic foundations of MU methods, including techniques such as localization-informed unlearning, unlearning-focused finetuning, and vision model-specific optimizers. We will provide a comprehensive and clear overview of the diverse range of applications for MU in CV. Furthermore, we will emphasize the importance of unlearning from an industry perspective, where modifying the model during its life-cycle is preferable to re-training it entirely, and where metrics to verify the unlearning process become paramount. Our tutorial will furnish the general audience with sufficient background information to grasp the motivation, research progress, opportunities, and ongoing challenges in MU.

Tutorial

Recent Advances in Vision Foundation Models

Zhengyuan Yang · Linjie Li · Zhe Gan · Chunyuan Li · Jianwei Yang
9:00 AM - 5:00 PM

This tutorial covers the advanced topics in designing and training vision foundation models, including the state-of-the-art approaches and principles in (i) learning vision foundation models for multimodal understanding and generation, (ii) benchmarking and evaluating vision foundation models, and (iii) agents and other advanced systems based on vision foundation models.

Tutorial

SCENIC: An Open-Source Probabilistic Programming System for Data Generation and Safety in AI-Based Autonomy

Edward Kim · Sanjit Seshia · Daniel Fremont · Jinkyu Kim · Kimin Lee · Hazem Torfah · Necmiye Ozay · Parasara Sridhar Duggirala · Marcell Vazquez-Chanlatte
9:00 AM - 12:00 PM

Autonomous systems, such as self-driving cars or intelligent robots, increasingly operate in complex, stochastic environments where they dynamically interact with multiple entities (human and robot). There is a need to formally model and generate such environments in simulation, for use cases that span synthetic training-data generation and rigorous evaluation of safety. In this tutorial, we provide an in-depth introduction to Scenic, a simulator-agnostic probabilistic programming language for modeling complex multi-agent physical environments with stochasticity and spatio-temporal constraints. Scenic has been used in a variety of domains such as self-driving, aviation, indoor robotics, multi-agent systems, and augmented/virtual reality. Using Scenic and associated open-source tools, one can (1) model and sample from distributions with spatial and temporal constraints, (2) generate synthetic data in a controlled, programmatic fashion to train and test machine learning components, (3) reason about the safety of AI-enabled autonomous systems, (4) automatically find edge cases, (5) debug and root-cause failures of AI components, including for perception, and (6) bridge the sim-to-real gap in autonomous system design. We will provide hands-on coverage of the basics of Scenic and its applications: how to create Scenic programs, how to build your own applications on top of Scenic, and how to interface the language to your simulator/renderer of choice. For more information on Scenic, please visit the website: https://scenic-lang.org
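Scenic has its own syntax, not shown here; purely to illustrate the underlying idea of item (1), sampling scenes from distributions subject to spatial constraints, here is a hypothetical rejection-sampling sketch in plain Python. The scene, constraint, and numbers are assumptions for illustration only.

```python
# Illustrative Python sketch (not Scenic syntax): sample a two-car scene whose
# positions come from distributions but must satisfy a declared spatial
# constraint -- here, a minimum separation between the vehicles.
import math
import random

def sample_scene(rng, min_gap=5.0, max_tries=1000):
    """Rejection-sample (ego, other) positions on a 100 m road segment."""
    for _ in range(max_tries):
        ego = (rng.uniform(0, 100), rng.uniform(-1, 1))     # (x, lane offset)
        other = (rng.uniform(0, 100), rng.uniform(-1, 1))
        if math.dist(ego, other) >= min_gap:                # spatial constraint
            return ego, other
    raise RuntimeError("constraint too tight to satisfy")

rng = random.Random(0)
scenes = [sample_scene(rng) for _ in range(100)]
print(all(math.dist(e, o) >= 5.0 for e, o in scenes))  # constraint holds
```

A Scenic program states such requirements declaratively and leaves the sampling strategy to the system, which is what makes it simulator-agnostic.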

Workshop

AI4Space 2024

Tat-Jun Chin
9:00 AM - 12:10 PM
Workshop
9:00 AM - 6:00 PM

The CVPR 2024 Workshop on Autonomous Driving (WAD) brings together leading researchers and engineers from academia and industry to discuss the latest advances in autonomous driving. Now in its 7th year, the workshop has been continuously evolving with this rapidly changing field and now covers all areas of autonomy, including perception, behavior prediction and motion planning. In this full-day workshop, our keynote speakers will provide insights into the ongoing commercialization of autonomous vehicles, as well as progress in related fundamental research areas. Furthermore, we will host a series of technical benchmark challenges to help quantify recent advances in the field, and invite authors of accepted workshop papers to present their work.

Workshop

Sight and Sound

Andrew Owens
9:00 AM - 6:00 PM
Tutorial
9:00 AM - 12:00 PM

For decades, stereo matching was approached with hand-crafted algorithms focused on measuring the similarity between local patterns in the two images and propagating this information globally. Since 2015, deep learning has led to a paradigm shift in this field, driving the community toward end-to-end deep networks capable of matching pixels. This revolution brought stereo matching to a whole new level of accuracy, yet not without drawbacks. Indeed, some hard challenges remained unsolved by the first generation of deep stereo models, as they were often unable to generalize properly across different domains -- e.g., from synthetic to real, from indoor to outdoor -- or to deal with high-resolution images.

This was, however, three years ago. These and other challenges have since been faced by the research community, making deep stereo matching even more mature and suitable as a practical solution for everyday applications. For instance, we now have networks capable of generalizing much better from synthetic to real images, as well as handling high-resolution images or even estimating disparity correctly in the presence of non-Lambertian surfaces -- known to be among the ill-posed challenges for stereo. Accordingly, in this tutorial we aim to give a comprehensive overview of the state of the art in deep stereo matching: which architectural designs have been crucial to reach this level of maturity, and how to select the best solution for estimating depth from stereo in real applications.

Workshop

Prompting in Vision

Amir Bar · Kaiyang Zhou
9:00 AM - 5:30 PM
Tutorial

Disentanglement and Compositionality in Computer Vision

Xin Jin · Wenjun Zeng · Tao Yang · Yue Song · Nicu Sebe · Xingyi Yang · Xinchao Wang · Shuicheng Yan
9:00 AM - 12:00 PM

This tutorial explores the concepts of disentanglement and compositionality in computer vision. These concepts play a crucial role in enabling machines to understand and interpret visual information with more sophistication and human-like reasoning. Participants will learn about advanced techniques and models that allow for the disentanglement of visual factors in images and the composition of these factors into more meaningful representations. All in all, disentanglement and compositionality are believed to be among the possible paths for AI to fundamentally understand the world and eventually achieve artificial general intelligence (AGI).
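One concrete building block behind many disentanglement methods (e.g. beta-VAE-style objectives) is a per-dimension KL penalty pushing each latent toward an isotropic Gaussian prior. The sketch below shows only that regularizer in closed form, under the assumption of a diagonal-Gaussian posterior; the reconstruction term and the encoder/decoder are omitted.

```python
# Closed-form KL(N(mu, exp(log_var)) || N(0, 1)), per latent dimension,
# scaled by beta as in beta-VAE-style disentanglement objectives.
import math

def kl_to_standard_normal(mu, log_var):
    """KL divergence to the standard normal prior, one value per dimension."""
    return [0.5 * (math.exp(lv) + m * m - 1.0 - lv)
            for m, lv in zip(mu, log_var)]

def beta_vae_regularizer(mu, log_var, beta=4.0):
    return beta * sum(kl_to_standard_normal(mu, log_var))

# A latent matching the prior pays no penalty; a shifted one does.
print(beta_vae_regularizer([0.0, 0.0], [0.0, 0.0]))   # 0.0
print(beta_vae_regularizer([1.0, 0.0], [0.0, 0.0]))   # 4.0 * 0.5 = 2.0
```

Weighting this term with beta > 1 pressures each latent dimension to carry information independently, which is one route to disentangled factors.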

Tutorial

Efficient Homotopy Continuation for Solving Polynomial Systems in Computer Vision Applications

Benjamin Kimia · Timothy Duff · Ricardo Fabbri · Hongyi Fan
1:30 PM - 6:00 PM

Minimal problems and their solvers play an important role in RANSAC-based approaches to several estimation problems in vision. Minimal solvers solve systems of equations, depending on data, which obey a “conservation of number” principle: for sufficiently generic data, the number of solutions over the complex numbers is constant. Homotopy continuation (HC) methods exploit not just this conservation principle but also the smooth dependence of solutions on problem data. The classical solution of polynomial systems using Gröbner bases, resultants, elimination templates, etc. has been largely successful on smaller problems, but these methods cannot tackle larger polynomial systems with larger numbers of solutions. While HC methods can solve such problems, they have been notoriously slow. Recent research by the presenters and other researchers has enabled efficient HC solvers capable of real-time solutions.

The main objective of this tutorial is to make this technology more accessible to the computer vision community. Specifically, after an overview of how such methods can be useful for solving problems in vision (e.g., absolute/relative pose, triangulation), we will describe some of the basic theoretical apparatus underlying HC solvers, including both local and global “probability-1” aspects. On the practical side, we will describe recent advances enabled by GPUs and learning-based approaches, and how to build your own HC-based minimal solvers.
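The core predict-then-correct loop of HC can be shown on a toy univariate example: track the roots of a start system g(x) = x² − 1 (roots known: ±1) to the target f(x) = x² − 2 along the straight-line homotopy H(x, t) = (1 − t)g(x) + t·f(x). This fixed-step sketch omits everything that makes real HC solvers work at scale (complex arithmetic, adaptive step sizes, the random "gamma trick").

```python
# Toy predictor-corrector homotopy continuation for x^2 - 2 = 0.
def H(x, t):
    return (1 - t) * (x * x - 1) + t * (x * x - 2)

def dH_dx(x, t):
    return 2 * x                           # same for start and target here

def dH_dt(x, t):
    return (x * x - 2) - (x * x - 1)       # = f(x) - g(x) = -1

def track(x, steps=100):
    """Follow one root of the start system from t = 0 to t = 1."""
    dt = 1.0 / steps
    t = 0.0
    for _ in range(steps):
        # Euler predictor along the solution path: dx/dt = -(dH/dt) / (dH/dx)
        x = x - dt * dH_dt(x, t) / dH_dx(x, t)
        t += dt
        for _ in range(3):                 # Newton corrector at the new t
            x = x - H(x, t) / dH_dx(x, t)
    return x

print(track(1.0))    # approx  sqrt(2) = 1.41421...
print(track(-1.0))   # approx -sqrt(2)
```

Each start root flows to one target root, which is exactly the conservation-of-number principle the description mentions; GPU solvers parallelize the tracking of many such paths.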

Workshop

Workshop on Virtual Try-On

Vidya Narayanan
1:30 PM - 6:00 PM
Tutorial
1:30 PM - 5:30 PM

Neural networks provide generalizable, task-independent representation spaces that have found widespread applicability in image-understanding applications. The complicated semantics of feature interactions within image data have been broken down into sets of non-linear functions, convolution parameters, attention mechanisms, and multi-modal inputs, among others. The complexity of these operations has introduced multiple vulnerabilities within neural network architectures, including susceptibility to adversarial and out-of-distribution samples, confidence-calibration issues, and catastrophic forgetting. Given that AI promises to herald the fourth industrial revolution, it is critical to understand and overcome these vulnerabilities; doing so requires creating robust neural networks to drive AI systems. Defining robustness, however, is not trivial: simple measurements of invariance to noise and perturbations are not applicable in real-life settings. In this tutorial, we provide a human-centric approach to understanding robustness in neural networks that allows AI systems to function in society. Doing so allows us to state the following: 1) all neural networks must provide contextual and relevant explanations to humans, 2) neural networks must know when and what they don’t know, and 3) neural networks must be amenable to human intervention at the decision-making stage. These three statements call for robust neural networks to be explainable, equipped with uncertainty quantification, and intervenable.
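One widely used post-hoc fix for the confidence-calibration issue mentioned above is temperature scaling: divide the logits by a temperature T (normally fitted on a validation set; fixed here purely for illustration) before the softmax. This softens over-confident predictions without changing the predicted class.

```python
# Sketch of temperature scaling for confidence calibration.
import math

def softmax(logits, temperature=1.0):
    z = [l / temperature for l in logits]
    m = max(z)                          # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

logits = [4.0, 1.0, 0.0]                # an over-confident prediction
p_raw = softmax(logits)
p_cal = softmax(logits, temperature=2.0)
print(max(p_raw) > max(p_cal))          # calibration softens confidence
print(p_raw.index(max(p_raw)) == p_cal.index(max(p_cal)))  # argmax unchanged
```

Because only the confidence changes, accuracy is preserved; the gain is that reported probabilities better reflect how often the model is actually right.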

Tutorial

Object-centric Representations in Computer Vision

Yanwei Fu · Francesco Locatello · Tianjun Xiao · Tong He · Ke Fan
1:30 PM - 6:00 PM

This tutorial discusses the evolution of object-centric representations in computer vision and deep learning. Initially inspired by the decomposition of visual scenes into surfaces and objects, recent developments focus on learning causal variables from high-dimensional observations like images or videos. The tutorial covers the objectives of object-centric learning (OCL), its development, and its connections with other machine learning fields, emphasizing object-centric approaches, especially in unsupervised segmentation. Advances in encoders, decoders, and self-supervised learning objectives are explored, with a focus on real-world applications and challenges. The tutorial also introduces open-source tools and showcases breakthroughs in video-based object-centric learning. It will feature four talks covering the basic ideas, learning good features for object-centric learning, video-based object-centric representation, and more diverse real-world applications.
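As a hedged illustration of the classic "decompose the scene into objects" idea that inspired the field, the sketch below groups foreground pixels of a binary mask into 4-connected components with a BFS. Modern object-centric models (e.g. slot-based ones) learn such groupings end-to-end from pixels rather than relying on hand-crafted connectivity; the mask here is made up.

```python
# Hand-crafted scene decomposition: count connected foreground regions.
from collections import deque

def connected_components(mask):
    """mask: 2-D list of 0/1 -> number of 4-connected foreground regions."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                regions += 1                    # found a new "object"
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:                    # flood-fill its pixels
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return regions

scene = [[1, 1, 0, 0],
         [0, 0, 0, 1],
         [0, 1, 0, 1]]
print(connected_components(scene))   # three separate "objects"
```

The limits of this heuristic (occlusion, touching objects, appearance) are what unsupervised object-centric learning tries to overcome with learned slots.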

Tutorial
1:30 PM - 5:00 PM

The 5Vs of big data (volume, value, variety, velocity, and veracity) pose immense opportunities and challenges for implementing local and planet-wide solutions from Earth observation (EO) data. EO data, which sits at the center of various multidisciplinary problems, is primarily obtained through satellite imagery, aerial photography, and UAV-based platforms. Understanding EO data unlocks this immense data source for addressing planet-scale problems with computer vision and machine learning techniques for geospatial analysis. This workshop introduces current EO data sources, problems, and image-based analysis techniques, along with the most recent advances in data, models, and the open-source analysis ecosystem for computer vision and deep learning on EO data.
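To ground what a basic EO analysis step looks like, here is a minimal sketch of NDVI, the normalized difference vegetation index, computed from red and near-infrared (NIR) reflectance. The per-pixel reflectance values below are made-up numbers for illustration.

```python
# NDVI = (NIR - red) / (NIR + red): values near +1 indicate dense vegetation;
# values near 0 or below indicate bare soil or water.
def ndvi(nir, red, eps=1e-9):
    """Normalized difference vegetation index, guarded against division by zero."""
    return (nir - red) / (nir + red + eps)

vegetation = ndvi(nir=0.50, red=0.08)   # healthy vegetation: strongly positive
water = ndvi(nir=0.02, red=0.05)        # water: negative
print(round(vegetation, 2), round(water, 2))
```

In practice the same formula is applied per pixel over whole satellite scenes (e.g. with array libraries), which is where the volume and velocity challenges above come in.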

Tutorial

Edge AI in Action: Practical Approaches to Developing and Deploying Optimized Models

Fabricio Narcizo · Elizabete Munzlinger · Anuj Dutt · Shan Shaffi · Sai Narsi Reddy Donthi Reddy
2:00 PM - 5:30 PM

Edge AI refers to artificial intelligence applied to edge devices like smartphones, tablets, laptops, cameras, sensors, and drones. It enables these devices to handle AI tasks autonomously, without cloud or central server connections, offering higher speed, lower latency, greater privacy, and reduced power consumption. Edge AI presents challenges and opportunities in model development and deployment, including size reduction, compression, quantization, and distillation, and involves integrating and communicating between edge devices and the cloud or other devices in a hybrid and distributed architecture. This tutorial provides practical guidance on developing and deploying optimized models for edge AI, covering theoretical and technical aspects, best practices, and real-world case studies focused on computer vision and deep learning models. We demonstrate tools and frameworks like TensorFlow, PyTorch, ONNX, OpenVINO, Google Mediapipe, and Qualcomm SNPE. We will also discuss multi-modal AI applications such as head pose estimation, person segmentation, hand gesture recognition, sound localization, and more. These applications use images, videos, and sounds to create interactive edge AI experiences. The presentation will include developing and deploying these models on Jabra collaborative business cameras and exploring integration with devices like Luxonis OAK-1 MAX, Neural Compute Engine Myriad X, and NVIDIA Jetson Nano Developer Kit.
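Of the model-optimization techniques listed above, quantization is the simplest to show end to end. The sketch below is a hypothetical illustration of symmetric per-tensor int8 post-training quantization: map float weights to int8 with a single scale, dequantize, and check the round-trip error; production toolchains (e.g. the frameworks named above) add per-channel scales, calibration, and quantization-aware training.

```python
# Symmetric per-tensor int8 quantization round trip.
def quantize_int8(weights):
    """Map floats to int8 codes with one scale; returns (codes, scale)."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.52, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
err = max(abs(a - b) for a, b in zip(weights, restored))
print(err <= scale / 2)   # round-trip error bounded by half a quantization step
```

The payoff on an edge device is a 4x size reduction versus float32 plus faster integer arithmetic, at the cost of the bounded error shown here.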
