Oral
None
Adversarial Style Optimization: Enhancing VLM Jailbreaks by GRPO-based Stylistic Triggers Optimization
Registration
Tue Jun 02 01:00 PM -- 07:00 PM (PDT) @ Lobby A None
Registration / Badge Pickup
Break
Wed Jun 03 06:00 AM -- 08:00 AM (PDT) @ ExHall C None
Breakfast
Registration
Wed Jun 03 06:00 AM -- 04:00 PM (PDT) @ Lobby A None
Registration / Badge Pickup
Tutorial
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 301/302 None
The Principles of Diffusion Models: Real-Time Continuous & Discrete Diffusion
Tutorial
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ Mile High 2B None
Tom Builds, Tom Breaks: Hands-On Attacks and Defenses for Vision-Language Systems
Tutorial
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ Mile High 3C None
Towards Safe Multi-Modal Learning: Evolving Threats and Safety Solutions
Tutorial
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 702 None
Edge AI in Action: Mastering On-Device Inference
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ Four Seasons 4 None
Workshop on "Bitter Lessons"
Workshop
Wed Jun 03 07:00 AM -- 11:30 AM (PDT) @ 111 None
Generative AI for XR and Identity-based Applications
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 506 None
GRAIL-V: Grounded Retrieval & Agentic Intelligence for Vision-Language
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 505 None
The 3rd Workshop on Human Motion Generation - New Perspective on Simulation, Animation, and VR applications
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 106 None
LatinX in Computer Vision Research Workshop
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ Mile High 1CD None
Multimodal Foundation Models for Biomedicine: Challenges and Opportunities
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 601 None
The 2nd Workshop on Multimodal Spatial Intelligence
Workshop
Wed Jun 03 07:00 AM -- 11:45 AM (PDT) @ Mile High 4EF None
On Sensor Vision Workshop
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 712 None
22nd Workshop on Perception Beyond the Visible Spectrum
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 109 None
The 2nd International Workshop & Challenge on Subtle Visual Computing @CVPR 2026
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 705/707 None
1st Workshop on Video World Models: Interaction, Memory, and Efficiency
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ 708 None
Women in Computer Vision
Workshop
Wed Jun 03 07:00 AM -- 11:00 AM (PDT) @ Mile High 2A None
Workshop on World Models Meet Active Sensing and Closed-Loop Planning
Workshop
Wed Jun 03 07:00 AM -- 11:30 AM (PDT) @ Mile High 1AB None
The 5th Explainable AI for Computer Vision (XAI4CV) Workshop
Workshop
Wed Jun 03 07:00 AM -- 11:30 AM (PDT) @ 108 None
PHAROS AI Factory for Medical Imaging & Healthcare
Workshop
Wed Jun 03 07:00 AM -- 04:00 PM (PDT) @ Mile High 1EF None
Workshop on Agentic AI for Visual Media
Workshop
Wed Jun 03 07:00 AM -- 04:00 PM (PDT) @ 503 None
Bridging Vision, Language, and Action: What’s Missing in Actionable Visual Perception for Robotics
Workshop
Wed Jun 03 07:00 AM -- 04:00 PM (PDT) @ Mile High 3A None
Autonomous Understanding Through Open-world Perception and Integrated Language models for On-road Tasks
Workshop
Wed Jun 03 07:00 AM -- 05:00 PM (PDT) @ 207 None
Foundation Models for V2X-Based Cooperative Autonomous Driving
Workshop
Wed Jun 03 07:00 AM -- 04:00 PM (PDT) @ Four Seasons 1 None
From Lab Demos to Daily Tasks: Embodied Intelligence in the Wild
Workshop
Wed Jun 03 07:00 AM -- 04:00 PM (PDT) @ 504 None
13th Workshop on Fine-grained Visual Categorization
Workshop
Wed Jun 03 07:00 AM -- 04:00 PM (PDT) @ 205 None
4th Workshop on Vision Based Industrial Inspection
Workshop
Wed Jun 03 07:00 AM -- 04:00 PM (PDT) @ Four Seasons 2 None
The 1st Workshop on Deployment of Foundation Models for Embodied AI
Workshop
Wed Jun 03 07:15 AM -- 12:00 PM (PDT) @ 102/104 None
Workshop on Vision-based Assistants in the Real-World
Workshop
Wed Jun 03 07:20 AM -- 11:30 AM (PDT) @ 113 None
Multimodal Alignment for a Pluralistic Society
Workshop
Wed Jun 03 07:25 AM -- 11:35 AM (PDT) @ 501 None
AI for Creative Visual Content Generation, Editing and Understanding
Workshop
Wed Jun 03 07:25 AM -- 12:00 PM (PDT) @ 203 None
IPA: Interactive Physical AI Workshop
Workshop
Wed Jun 03 07:30 AM -- 11:30 AM (PDT) @ 610/612 None
AI for Content Creation
Workshop
Wed Jun 03 07:30 AM -- 11:30 AM (PDT) @ Mile High 4AB None
The 3rd AI for Visual Arts Workshop and Challenges
Workshop
Wed Jun 03 07:30 AM -- 11:30 AM (PDT) @ 710 None
The 5th DataCV Workshop and Challenge
Workshop
Wed Jun 03 07:30 AM -- 10:59 AM (PDT) @ 711 None
The 5th Workshop on Federated Learning for Computer Vision
Workshop
Wed Jun 03 07:30 AM -- 11:30 AM (PDT) @ 112 None
Generative AI for Sign Language
Workshop
Wed Jun 03 07:30 AM -- 04:00 PM (PDT) @ Mile High 2C None
Sense of Space: Multi-Sensory Modeling for Embodied Intelligence
Workshop
Wed Jun 03 07:30 AM -- 05:00 PM (PDT) @ 703 None
Visual General Intelligence
Workshop
Wed Jun 03 07:30 AM -- 11:30 AM (PDT) @ 507 None
AI4RWC: The 2nd International Workshop on Vision Intelligence for Real-world Challenges
Workshop
Wed Jun 03 07:45 AM -- 04:00 PM (PDT) @ Mile High 4CD None
Computational Cameras and Displays
Workshop
Wed Jun 03 07:45 AM -- 04:50 PM (PDT) @ 704/706 None
Third Joint Egocentric Vision (EgoVis) Workshop
Workshop
Wed Jun 03 07:50 AM -- 11:30 AM (PDT) @ 110 None
AERO-HPR: Human Perception and Recognition in Aerial Surveillance
Workshop
Wed Jun 03 07:50 AM -- 11:30 AM (PDT) @ 107 None
2nd Workshop on Photorealistic 3D Head Avatars
Workshop
Wed Jun 03 07:50 AM -- 02:30 PM (PDT) @ 502 None
Efficient Deep Learning for Computer Vision
Tutorial
Wed Jun 03 08:00 AM -- 11:15 AM (PDT) @ 201 None
Accelerated Diffusion Models: From Theory to Interactive World Models
Workshop
Wed Jun 03 08:00 AM -- 12:00 PM (PDT) @ 105 None
The 3rd Workshop on AI for Content Generation, Quality Enhancement and Streaming
Workshop
Wed Jun 03 08:00 AM -- 11:30 AM (PDT) @ 709 None
The 22nd Embedded Vision Workshop
Workshop
Wed Jun 03 08:00 AM -- 11:30 AM (PDT) @ 607 None
The 3rd Workshop on Foundation Models for Medical Vision
Workshop
Wed Jun 03 08:00 AM -- 03:00 PM (PDT) @ 605 None
12th Workshop on Medical Computer Vision
Workshop
Wed Jun 03 08:00 AM -- 05:00 PM (PDT) @ Mile High 3B None
Urban Scene Modeling: Structured, Semantic, and Synthetic 3D Habitats
Workshop
Wed Jun 03 08:15 AM -- 05:00 PM (PDT) @ 603 None
Workshop on Autonomous Driving
Break
Wed Jun 03 09:00 AM -- 10:00 AM (PDT) @ ExHall A None
Coffee Break
Tutorial
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ Mile High 3C None
Principled Interpretability in Vision Models: From Mechanistic Understanding to Interpretable Models by Design
Tutorial
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 702 None
Monte Carlo physical simulation
Tutorial
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 201 None
Building GenAI based Simulation Environment for End-to-End Autonomous Driving
Tutorial
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 301/302 None
From Perception to Simulation: The Emergence of World Models in Multi-modal Reasoning
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 507 None
GigaBrain Challenge 2026: Workshop on World Models Empowering Vision Language Action Model
Workshop
Wed Jun 03 12:00 PM -- 04:45 PM (PDT) @ 106 None
The Second CVPR Workshop on Foundation and Large Vision Models in Remote Sensing (MORSE)
Workshop
Wed Jun 03 12:00 PM -- 05:15 PM (PDT) @ Mile High 1CD None
The 2nd 3D-LLM/VLA Workshop: Bridging Language, Vision and Action in 3D Environments
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 505 None
10th Affective & Behavior Analysis in-the-wild
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 103 None
Authenticity & Provenance in the age of Generative AI
Workshop
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 711 None
Auto-Annotation with Expert-Crafted Guidelines
Workshop
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 610/612 None
Cognitive Foundations for Multimodal Models
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 109 None
Computer Vision for the Built World
Workshop
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 102/104 None
Computer Vision with Small Data: Beyond Scale -- Toward Data-Efficient Dynamically-Aware Video Intelligence
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 112 None
Computer Vision for Biomechanics Workshop
Workshop
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 110 None
Sixth Workshop on Neural Architecture Search
Workshop
Wed Jun 03 12:00 PM -- 05:05 PM (PDT) @ 111 None
DataMFM: Emerging Directions in Data for Multimodal Foundation Models
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 501 None
End-to-End 3D Learning
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 203 None
3rd Workshop on Efficient and On-Device Generation (EDGE), CVPR 2026
Workshop
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ Mile High 4EF None
1st Workshop on Multi-Agent Robotic Systems: Scaling with Compositional Intelligence
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ Four Seasons 4 None
The 5th Workshop on “What is Next in Multimodal Foundation Models?”
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 601 None
Workshop on Multimodal Human Motion Analysis
Workshop
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 105 None
The 1st Workshop on Monitoring the World through an Imperfect Lens
Workshop
Wed Jun 03 12:00 PM -- 04:30 PM (PDT) @ 210/212 None
2nd Workshop on Multimodal Sign Language Recognition
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 709 None
The 3rd MetaFood Workshop (MTF)
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ Mile High 1AB None
Machine Unlearning for Vision
Workshop
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 705/707 None
OpenSUN3D: 6th Workshop on Open-World 3D Scene Understanding with Foundation Models
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ 107 None
Synthetic & Adversarial ForEnsics
Workshop
Wed Jun 03 12:00 PM -- 04:30 PM (PDT) @ 710 None
3rd Workshop on ScanNet++ Novel View Synthesis and 3D Semantic Understanding Challenge
Workshop
Wed Jun 03 12:00 PM -- 05:00 PM (PDT) @ Mile High 2A None
The 7th International Workshop and CVML Challenge on Agriculture-Vision: Challenges & Opportunities for Computer Vision in Agriculture
Workshop
Wed Jun 03 12:00 PM -- 04:00 PM (PDT) @ 108 None
The 1st Workshop on Vision for Intelligent Task Assistants
Workshop
Wed Jun 03 12:15 PM -- 05:00 PM (PDT) @ 113 None
Second Workshop on Foundation and Generative Models in Biometrics
Workshop
Wed Jun 03 12:20 PM -- 04:30 PM (PDT) @ Mile High 4AB None
Rediscovering Intelligence: Can AI Still Learn from Humans?
Workshop
Wed Jun 03 12:25 PM -- 04:30 PM (PDT) @ 506 None
The 2nd Workshop on Test-time Scaling for Computer Vision
Tutorial
Wed Jun 03 12:30 PM -- 03:45 PM (PDT) @ Mile High 2B None
3D Human Mesh Modeling and Recovery from RGB and LiDAR
Workshop
Wed Jun 03 12:30 PM -- 04:45 PM (PDT) @ 708 None
Spatial Intelligence for Cultural Heritage
Workshop
Wed Jun 03 12:45 PM -- 04:40 PM (PDT) @ 607 None
The 5th Workshop on Transformers for Vision and Multimodal AI
Workshop
Wed Jun 03 01:00 PM -- 04:00 PM (PDT) @ 712 None
The 1st Workshop on AI-assisted Long Video Creation
Break
Wed Jun 03 02:00 PM -- 03:00 PM (PDT) @ ExHall A None
Coffee Break
Break
Thu Jun 04 06:00 AM -- 08:00 AM (PDT) @ ExHall C None
Breakfast
Registration
Thu Jun 04 06:00 AM -- 04:00 PM (PDT) @ Lobby A None
Registration / Badge Pickup
Workshop
Thu Jun 04 06:30 AM -- 11:30 AM (PDT) @ 105 None
3D Geometry Generation for Scientific Computing (2nd Edition)
Workshop
Thu Jun 04 06:30 AM -- 11:30 AM (PDT) @ 704/706 None
2nd Workshop on Knowledge-Intensive Multimodal Reasoning
Tutorial
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ 702 None
Extending Computer Vision to Hidden Objects: A Tutorial on Millimeter-Wave Imaging and Reconstruction of Occluded Scenes
Tutorial
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ Mile High 2B None
The Full Stack of Physical AI: Simulation, Foundation Models, and Edge Deployment for Next-Generation Robotics Applications
Tutorial
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ 201 None
Recent Advances in AI for Medical Imaging: Progress, Challenges, and Future Directions
Tutorial
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ 203 None
Computer Vision at Scale: Multi-Camera Tracking, Calibration, and Event Detection for Checkout-Free Retail
Workshop
Thu Jun 04 07:00 AM -- 11:30 AM (PDT) @ 703 None
Third Workshop for Learning 3D with Multi-View Supervision
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ 610/612 None
6th Workshop on 3D Scene Understanding for Vision, Graphics, and Robotics
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ 502 None
Workshop on Any-to-any Multimodal Learning
Workshop
Thu Jun 04 07:00 AM -- 11:30 AM (PDT) @ 102/104 None
The 3rd Workshop on New Trends in AI-Generated Media and Security
Workshop
Thu Jun 04 07:00 AM -- 11:30 AM (PDT) @ 106 None
2nd Workshop on Computer Vision for Children
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ Four Seasons 2 None
The 5th Workshop on Computer Vision in the Wild: Towards Unified Multimodal Agents For Reasoning in the Wild
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ Mile High 2C None
The Second Workshop on the Evaluation of the Generative Foundation Models
Workshop
Thu Jun 04 07:00 AM -- 11:30 AM (PDT) @ 607 None
Geometry-Free Novel View Synthesis and Controllable Video Models
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ 710 None
Humans of Generative AI
Workshop
Thu Jun 04 07:00 AM -- 11:30 AM (PDT) @ 504 None
The 1st Workshop on Low‑Level Vision Frontiers with Generative AI, Preference Optimization, and Agentic Systems
Workshop
Thu Jun 04 07:00 AM -- 11:10 AM (PDT) @ 711 None
6th Omnidirectional Computer Vision Workshop
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ 712 None
Open-World Vision
Workshop
Thu Jun 04 07:00 AM -- 11:20 AM (PDT) @ 113 None
From Perception to Persuasion: Challenges and Advances in Misinformation Detection in Society
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ Mile High 3A None
SPAR-3D: Security, Privacy, and Adversarial Robustness in 3D Generative Vision Models
Workshop
Thu Jun 04 07:00 AM -- 11:30 AM (PDT) @ 705/707 None
Trustworthy, Robust, Uncertainty-Aware, and Explainable Visual Intelligence and Beyond
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ Mile High 4EF None
The 8th UG2+ Workshop and Challenge: Bridging the Gap between Computational Photography and Visual Perception
Workshop
Thu Jun 04 07:00 AM -- 11:30 AM (PDT) @ Mile High 4AB None
Unified Robotic Vision with Cross-Modal Sensing and Alignment
Workshop
Thu Jun 04 07:00 AM -- 11:00 AM (PDT) @ 506 None
9th International Workshop on Visual Odometry and Computer Vision Applications Based on Location Clues
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ 112 None
11th Workshop on Computer Vision and Multimodal Microscopy Image Analysis
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ 107 None
The Seventh Annual Embodied Artificial Intelligence Workshop
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ Mile High 2A None
2nd Workshop on Agents in Interaction, from Humans to Robots
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ 505 None
Mobile AI workshop and associated challenges, 6th edition
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ Four Seasons 1 None
Multi-Agent Embodied Intelligent Systems Meet Agentic-AI era: Opportunities, Challenges and Futures
Workshop
Thu Jun 04 07:00 AM -- 05:00 PM (PDT) @ 207 None
11th New Trends in Image Restoration and Enhancement Workshop and Challenges
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ Mile High 3B None
Video Generative Models: Benchmarks and Evaluation
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ Four Seasons 4 None
2nd Workshop on Video Large Language Models
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ 501 None
Workshop on Visual Concepts
Workshop
Thu Jun 04 07:00 AM -- 04:00 PM (PDT) @ Mile High 1CD None
Sight and Sound
Workshop
Thu Jun 04 07:10 AM -- 11:30 AM (PDT) @ Mile High 1AB None
4th Workshop on Maritime Computer Vision
Tutorial
Thu Jun 04 07:30 AM -- 04:00 PM (PDT) @ Mile High 3C None
Analytic understanding of diffusion models
Workshop
Thu Jun 04 07:30 AM -- 11:30 AM (PDT) @ 108 None
6th Workshop on CV4Animals: Computer Vision for Animal Behavior Tracking and Modeling
Workshop
Thu Jun 04 07:30 AM -- 11:00 AM (PDT) @ 603 None
Exploring the Next Generation of Data
Workshop
Thu Jun 04 07:30 AM -- 11:30 AM (PDT) @ Mile High 4CD None
Personalization in Generative AI Workshop
Workshop
Thu Jun 04 07:30 AM -- 11:30 AM (PDT) @ 110 None
PhysHuman: Physically Grounded Human Perception and Modeling
Workshop
Thu Jun 04 07:30 AM -- 11:30 AM (PDT) @ 103 None
Safe Artificial Intelligence for All Domains
Workshop
Thu Jun 04 07:45 AM -- 11:20 AM (PDT) @ 709 None
VizWiz Grand Challenge: Interpreting Images and Videos Taken by Blind People
Workshop
Thu Jun 04 07:45 AM -- 04:00 PM (PDT) @ 205 None
4th Workshop on Generative Models for Computer Vision
Workshop
Thu Jun 04 07:45 AM -- 05:30 PM (PDT) @ 111 None
9th Multimodal Learning and Applications Workshop
Workshop
Thu Jun 04 07:55 AM -- 11:30 AM (PDT) @ 601 None
Multimodal Algorithmic Reasoning Workshop
Tutorial
Thu Jun 04 08:00 AM -- 04:30 PM (PDT) @ 301/302 None
All You Need To Know About Self-Driving
Workshop
Thu Jun 04 08:00 AM -- 11:15 AM (PDT) @ 210/212 None
The Eighth Workshop on Precognition: Seeing through the Future
Workshop
Thu Jun 04 08:00 AM -- 04:00 PM (PDT) @ 708 None
The 6th Workshop of Adversarial Machine Learning on Computer Vision: Safety of Vision-Language Agents
Workshop
Thu Jun 04 08:00 AM -- 04:30 PM (PDT) @ 503 None
12th IEEE International Workshop on Computer Vision in Sports
Workshop
Thu Jun 04 08:00 AM -- 04:00 PM (PDT) @ 507 None
EarthVision: Large Scale Computer Vision for Remote Sensing Imagery
Workshop
Thu Jun 04 08:00 AM -- 04:00 PM (PDT) @ 605 None
Embodied Reasoning in Action: Workshop and Challenge on Embodied Reasoning for Robotic Manipulation
Workshop
Thu Jun 04 08:00 AM -- 04:30 PM (PDT) @ 109 None
2nd Workshop on Human-Interactive Generation and Editing
Workshop
Thu Jun 04 08:00 AM -- 04:00 PM (PDT) @ Mile High 1EF None
How Do Vision Models Work?
Break
Thu Jun 04 09:00 AM -- 10:00 AM (PDT) @ ExHall A None
Coffee Break
Tutorial
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ 201 None
The Road to Convergence: Evolution of Unified Multimodal Models
Tutorial
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ Mile High 2B None
Foundations and Frontiers of Watermarking: Algorithms, Multimodal Extensions, Benchmarks, and Authenticity Frameworks
Tutorial
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ 702 None
From Perception to Action: Building Efficient and Deployable Robot Intelligence Pipelines with Open-Source Edge AI Toolkits
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ 603 None
1st Workshop on Generative 3D Reconstruction
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ 110 None
Medical Reasoning with Vision Language Foundation Models
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ Mile High 2C None
4D Digital Twins: Real-to-Sim-to-Real for Physical AI
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ 506 None
2nd Workshop on 4D Vision: Modeling the Dynamic World
Workshop
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ 710 None
Artificial Intelligence for Space
Workshop
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ 105 None
2nd Workshop on GenAI for Storytelling
Workshop
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ Four Seasons 2 None
Big Model Adaptation In Computer Vision
Workshop
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ 106 None
CVPR 2026 Biometrics Workshop
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ Mile High 1AB None
Bridging AI and Medical Reality: Computer Vision for Real-world Clinical Translation
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ 113 None
Computer Vision × Education: Building a Cross‑Community Agenda for Multimodal Vision in Classrooms
Workshop
Thu Jun 04 12:00 PM -- 04:45 PM (PDT) @ 709 None
CV4Science: Using Computer Vision for the Sciences
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ 103 None
Domain Generalization: Evolution, Breakthroughs, and Future Horizons (2nd Edition)
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ 703 None
The 2nd CVPR Workshop on Foundation Models Meet Embodied Agents
Workshop
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ 711 None
The 7th International Workshop on Eye and Gaze in Computer Vision
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ 504 None
Eighth Workshop on Image Matching: Local Features and Beyond
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ Mile High 4CD None
1st Workshop on Journey to the Awards: Generative AI for Movie-Grade Video Production (J2A), CVPR 2026
Workshop
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ Mile High 3A None
The 2nd Workshop on Multi-Modal Reasoning for Agentic Intelligence
Workshop
Thu Jun 04 12:00 PM -- 05:00 PM (PDT) @ 203 None
4D World Models: Bridging Generation and Reconstruction
Workshop
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ 102/104 None
Third Workshop on Simulation for Autonomous Driving
Workshop
Thu Jun 04 12:00 PM -- 04:00 PM (PDT) @ 610/612 None
ScaleBot: The First Workshop on Scalable Robot Learning Systems
Workshop
Thu Jun 04 12:00 PM -- 04:30 PM (PDT) @ 607 None
The 3rd Workshop on Synthetic Data for Computer Vision
Workshop
Thu Jun 04 12:15 PM -- 05:00 PM (PDT) @ 705/707 None
Second Workshop on Skilled Activity Understanding, Assessment & Feedback Generation
Workshop
Thu Jun 04 12:30 PM -- 04:30 PM (PDT) @ 712 None
The Third Workshop on Anomaly Detection with Foundation Models
Workshop
Thu Jun 04 12:30 PM -- 04:30 PM (PDT) @ Mile High 4AB None
Appearance Understanding and Generation
Workshop
Thu Jun 04 12:30 PM -- 03:30 PM (PDT) @ 502 None
Pixel-level Video Understanding in the Wild Challenge
Workshop
Thu Jun 04 12:30 PM -- 05:00 PM (PDT) @ 601 None
Visual Anomaly and Novelty Detection - 4th Edition
Workshop
Thu Jun 04 01:00 PM -- 04:40 PM (PDT) @ 108 None
See the World in a Different Light: Physical Appearance Modeling and Relighting in the Age of Generative AI
Workshop
Thu Jun 04 01:00 PM -- 04:30 PM (PDT) @ 704/706 None
6th International Workshop on Long-form Video Understanding, Generation and Action
Break
Thu Jun 04 02:00 PM -- 03:00 PM (PDT) @ ExHall A None
Coffee Break
Break
Fri Jun 05 06:00 AM -- 08:00 AM (PDT) @ ExHall C None
Breakfast
Registration
Fri Jun 05 06:00 AM -- 04:00 PM (PDT) @ Lobby A None
Registration / Badge Pickup
Remarks
Fri Jun 05 07:30 AM -- 08:00 AM (PDT) @ Bluebird Ballroom None
Welcome & Awards
Poster Setup
Fri Jun 05 07:45 AM -- 08:15 AM (PDT) @ ExHall A None
Poster Setup
Break
Fri Jun 05 08:00 AM -- 08:15 AM (PDT) None
Courtesy Break
Oral
Fri Jun 05 08:15 AM -- 08:27 AM (PDT) @ Four Seasons Ballroom None
Black-box Membership Inference Attacks on the Pre-training Data of Image-generation Models
Oral
Fri Jun 05 08:15 AM -- 08:30 AM (PDT) @ Bluebird Ballroom None
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space
Oral
Fri Jun 05 08:15 AM -- 08:27 AM (PDT) @ Mile High Ballroom 1A - 2A None
Advancing Image Classification with Discrete Diffusion Classification Modeling
Oral
Fri Jun 05 08:15 AM -- 08:27 AM (PDT) @ Mile High Ballroom 3A - 4A None
Customized Fusion: A Closed-Loop Dynamic Network for Adaptive Multi-Task-Aware Infrared-Visible Image Fusion
Oral Session
Fri Jun 05 08:15 AM -- 09:30 AM (PDT) @ Mile High Ballroom 3A - 4A None
Oral Session 1D: Computational Imaging
Oral Session
Fri Jun 05 08:15 AM -- 09:30 AM (PDT) @ Mile High Ballroom 1A - 2A None
Oral Session 1C: Efficient Reasoning
Oral Session
Fri Jun 05 08:15 AM -- 09:30 AM (PDT) @ Four Seasons Ballroom None
Oral Session 1B: Visual Security
Oral Session
Fri Jun 05 08:15 AM -- 09:30 AM (PDT) @ Bluebird Ballroom None
Oral Session 1A: Multimodal Vision
Oral
Fri Jun 05 08:27 AM -- 08:40 AM (PDT) @ Four Seasons Ballroom None
Data Leakage Detection and De-duplication in Large Scale Geospatial Image Datasets
Oral
Fri Jun 05 08:27 AM -- 08:40 AM (PDT) @ Mile High Ballroom 3A - 4A None
Dual Band Thermal Videography: Separating Time-Varying Reflection and Emission Near Ambient Conditions
Oral
Fri Jun 05 08:27 AM -- 08:40 AM (PDT) @ Mile High Ballroom 1A - 2A None
Does YOLO Really Need to See Every Training Image in Every Epoch?
Oral
Fri Jun 05 08:30 AM -- 08:45 AM (PDT) @ Bluebird Ballroom None
ANTS: Adaptive Negative Textual Space Shaping for OOD Detection via Test-Time MLLM Understanding and Reasoning
Oral
Fri Jun 05 08:40 AM -- 08:52 AM (PDT) @ Mile High Ballroom 3A - 4A None
MetaSpectra+: A Compact Broadband Metasurface Camera for Snapshot Hyperspectral+ Imaging
Oral
Fri Jun 05 08:40 AM -- 08:52 AM (PDT) @ Four Seasons Ballroom None
RAVEN: Erasing Invisible Watermarks via Novel View Synthesis
Oral
Fri Jun 05 08:40 AM -- 08:52 AM (PDT) @ Mile High Ballroom 1A - 2A None
Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks
Oral
Fri Jun 05 08:45 AM -- 09:00 AM (PDT) @ Bluebird Ballroom None
ARGUS: Defending Against Multimodal Indirect Prompt Injection via Steering Instruction-Following Behavior
Oral
Fri Jun 05 08:52 AM -- 09:05 AM (PDT) @ Mile High Ballroom 1A - 2A None
NuWa: Deriving Lightweight Class-Specific Vision Transformers for Edge Devices
Oral
Fri Jun 05 08:52 AM -- 09:05 AM (PDT) @ Four Seasons Ballroom None
LDP-Slicing: Local Differential Privacy for Images via Randomized Bit-Plane Slicing
Oral
Fri Jun 05 08:52 AM -- 09:05 AM (PDT) @ Mile High Ballroom 3A - 4A None
Spectrum from Defocus: Fast Spectral Imaging with Chromatic Focal Stack
Oral
Fri Jun 05 09:00 AM -- 09:15 AM (PDT) @ Bluebird Ballroom None
TEAR: Temporal-aware Automated Red-teaming for Text-to-Video Models
Oral
Fri Jun 05 09:05 AM -- 09:17 AM (PDT) @ Mile High Ballroom 3A - 4A None
Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework
Oral
Fri Jun 05 09:05 AM -- 09:17 AM (PDT) @ Mile High Ballroom 1A - 2A None
Plant Taxonomy Meets Plant Counting: A Fine-Grained, Taxonomic Dataset for Counting Hundreds of Plant Species
Oral
Fri Jun 05 09:05 AM -- 09:17 AM (PDT) @ Four Seasons Ballroom None
NOWA: Null-space Optical Watermark for Invisible Capture Fingerprinting and Tamper Localization
Poster Setup
Fri Jun 05 09:15 AM -- 09:45 AM (PDT) @ ExHall A None
Poster Setup
Oral
Fri Jun 05 09:15 AM -- 09:30 AM (PDT) @ Bluebird Ballroom None
ViT^3: Unlocking Test-Time Training in Vision
Oral
Fri Jun 05 09:17 AM -- 09:30 AM (PDT) @ Mile High Ballroom 1A - 2A None
Rethinking Dataset Distillation: Hard Truths about Soft Labels
Oral
Fri Jun 05 09:17 AM -- 09:30 AM (PDT) @ Mile High Ballroom 3A - 4A None
UnReflectAnything: RGB-Only Highlight Removal by Rendering Synthetic Specular Supervision
Oral
Fri Jun 05 09:17 AM -- 09:30 AM (PDT) @ Four Seasons Ballroom None
Revisiting Geometric Obfuscation with Dual Convergent Lines for Privacy-Preserving Image Queries in Visual Localization
Break
Fri Jun 05 09:45 AM -- 10:30 AM (PDT) @ ExHall F None
Coffee
Demonstration
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall F None
Demos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 1
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 2
Adversarial Style Optimization: Enhancing VLM Jailbreaks by GRPO-based Stylistic Triggers Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 3
ANTS: Adaptive Negative Textual Space Shaping for OOD Detection via Test-Time MLLM Understanding and Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 4
ARGUS: Defending Against Multimodal Indirect Prompt Injection via Steering Instruction-Following Behavior
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 5
TEAR: Temporal-aware Automated Red-teaming for Text-to-Video Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 6
ViT^3: Unlocking Test-Time Training in Vision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 7
Black-box Membership Inference Attacks on the Pre-training Data of Image-generation Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 8
Data Leakage Detection and De-duplication in Large Scale Geospatial Image Datasets
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 9
RAVEN: Erasing Invisible Watermarks via Novel View Synthesis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 10
LDP-Slicing: Local Differential Privacy for Images via Randomized Bit-Plane Slicing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 11
NOWA: Null-space Optical Watermark for Invisible Capture Fingerprinting and Tamper Localization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 12
Revisiting Geometric Obfuscation with Dual Convergent Lines for Privacy-Preserving Image Queries in Visual Localization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 14
Does YOLO Really Need to See Every Training Image in Every Epoch?
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 15
Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 16
NuWa: Deriving Lightweight Class-Specific Vision Transformers for Edge Devices
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 17
Plant Taxonomy Meets Plant Counting: A Fine-Grained, Taxonomic Dataset for Counting Hundreds of Plant Species
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 18
Rethinking Dataset Distillation: Hard Truths about Soft Labels
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 19
Customized Fusion: A Closed-Loop Dynamic Network for Adaptive Multi-Task-Aware Infrared-Visible Image Fusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 20
Dual Band Thermal Videography: Separating Time-Varying Reflection and Emission Near Ambient Conditions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 21
MetaSpectra+: A Compact Broadband Metasurface Camera for Snapshot Hyperspectral+ Imaging
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 22
Spectrum from Defocus: Fast Spectral Imaging with Chromatic Focal Stack
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 23
Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 24
UnReflectAnything: RGB-Only Highlight Removal by Rendering Synthetic Specular Supervision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 25
AVGGT: Rethinking Global Attention for Accelerating VGGT
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 26
ManifoldNeuS: Manifold-aware View Optimizability for Pose-Free Neural Surface Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 27
LongStream: Long-Sequence Streaming Autoregressive Visual Geometry
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 28
RPGFusion: 4D Radar Prior-Guided Multi-Modal Fusion for 3D Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 29
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 30
JRM: Joint Reconstruction Model for Multiple Objects without Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 32
FreeScale: Scaling 3D Scenes via Certainty-Aware Free-View Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 33
Complet4R: Geometric Complete 4D Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 34
Unblur-SLAM: Dense Neural SLAM for Blurry Inputs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 35
Learning Compact 3D Representations from Feed-Forward Novel View Synthesis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 36
Fast Spatial Tracking with Visual Geometry Transformer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 37
How Much 3D Do Video Foundation Models Encode?
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 38
MetroGS: Efficient and Stable Reconstruction of Geometrically Accurate High-Fidelity Large-Scale Scenes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 39
RnG: A Unified Transformer for Complete 3D Modeling from Partial Observations
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 40
Long-Tail Internet Photo Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 41
Emergent Outlier View Rejection in Visual Geometry Grounded Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 42
Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 43
MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 44
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 45
Design Your Ad: Personalized Advertising Image and Text Generation with Unified Autoregressive Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 47
ConsistCompose: Unified Multimodal Layout Control for Image Composition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 48
A Training-Free Style-Personalization via SVD-Based Feature Decomposition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 49
Beyond Patches: Global-aware Autoregressive Model for Multimodal Few-Shot Font Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 50
ImageRAGTurbo: Towards One-step Text-to-Image Generation with Retrieval-Augmented Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 51
OmniSonic: Towards Universal and Holistic Audio Generation from Video and Text
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 53
Curriculum Group Policy Optimization: Adaptive Sampling for Unleashing the Potential of Text-to-Image Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 54
SplitFlux: Learning to Decouple Content and Style from a Single Image
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 55
FontCrafter: High-Fidelity Element-Driven Artistic Font Creation with Visual In-Context Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 56
EmoStyle: Emotion-Driven Image Stylization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 57
Text-Image Conditioned 3D Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 58
IntroSVG: Learning from Rendering Feedback for Text-to-SVG Generation via an Introspective Generator–Critic Framework
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 59
AnyDoc: Enhancing Document Generation via Large-Scale HTML/CSS Data Synthesis and Height-Aware Reinforcement Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 60
Reasoning Diffusion for Unpaired Test Time Out-of-distribution Text-Image to Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 61
SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 62
STAGE: Storyboard-Anchored Generation for Cinematic Multi-shot Narrative
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 63
MTA: Multimodal Task Alignment for BEV Perception and Captioning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 64
β-CLIP: Text-Conditioned Contrastive Learning for Multi-Granular Vision-Language Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 65
SafeRoPE: Risk-specific Head-wise Embedding Rotation for Safe Generation in Rectified Flow Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 66
FALCON: False-Negative Aware Learning of Contrastive Negatives in Vision-Language Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 67
Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 69
Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 70
EMO-R3: Reflective Reinforcement Learning for Emotional Reasoning in Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 71
EvoGraph-R1: Self-Evolving Multimodal Knowledge Hypergraphs for Agentic Retrieval
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 72
Cross-modal Identity Mapping: Minimizing Information Loss in Modality Conversion via Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 73
Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 74
Stabilizing Feature Geometry in Noisy Pretrained Models for Robust Downstream Tasks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 75
Black-Box Domain Adaptation for Object Detection with Retention-Driven Knowledge Compression
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 76
Decoupled and Reusable Adaptation for Efficient Cross-Modal Transfer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 77
Preference-Aligned LoRA Merging: Preserving Subspace Coverage and Addressing Directional Anisotropy
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 78
Curvature-Aware Zeroth-Order Optimization for Memory-Efficient Test-Time Adaptation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 79
Label-Free Cross-Task LoRA Merging with Null-Space Compression
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 80
Basis-Oriented Low-rank Transfer for Few-Shot and Test-Time Adaptation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 81
GeCo: Geometry-Consistent Regularization for Domain Generalized Semantic Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 82
Event-based Motion Deblurring with Unpaired Data
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 83
Stable Spike: Dual Consistency Optimization via Bitwise AND Operations for Spiking Neural Networks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 84
Event-based Visual Deformation Measurement
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 85
Bidirectional Cross-Modal Prompting for Event-Frame Asymmetric Stereo
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 86
SpikeTrack: High-performance and Energy-efficient Event-Based Object Tracking with Spiking Neural Network
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 87
Event Structural Valley: A Unified Theoretical and Practical Framework for Event Camera Autofocus
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 89
Do You Have Freestyle? Expressive Humanoid Locomotion via Audio Control
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 90
CLaD: Planning with Grounded Foresight via Cross-Modal Latent Dynamics
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 91
InternData-A1: Pioneering High-Fidelity Synthetic Data for Pre-training Generalist Policy
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 92
DemoFunGrasp: Universal Dexterous Functional Grasping via Demonstration-Editing Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 93
GeniNav: Generative Model Driven Image-Goal Navigation via Imagination-Guided Consistency Flow Matching
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 94
Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 95
DRAMA: Next-Gen Dynamic Orchestration for Resilient Multi-Agent Ecosystems in Flux
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 96
Arcadia: Toward a Full-Lifecycle Framework for Embodied Lifelong Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 97
Wanderland: Geometrically Grounded Simulation for Open-World Embodied AI
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 98
ORV: 4D Occupancy-centric Robot Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 99
DextER: Language-driven Dexterous Grasp Generation with Embodied Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 100
Language-Free Generative Editing from One Visual Example
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 101
Omni IIE Bench: Benchmarking the Practical Capabilities of Image Editing Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 102
LuxRemix: Lighting Decomposition and Remixing for Indoor Scenes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 103
CompBench: Benchmarking Complex Instruction-guided Image Editing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 104
Garments2Look: A Multi-Reference Dataset for High-Fidelity Outfit-Level Virtual Try-On with Clothing and Accessories
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 105
Learning Personalized Photographic Style from Pairwise User Preferences
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 106
CogniEdit: Dense Gradient Flow Optimization for Fine-Grained Image Editing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 108
MOSAIC-GS: Monocular Scene Reconstruction via Advanced Initialization for Complex Dynamic Environments
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 109
REArtGS++: Generalizable Articulation Reconstruction with Temporal Geometry Constraint via Planar Gaussian Splatting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 110
Dynamic-eDiTor: Training-Free Text-Driven 4D Scene Editing with Multimodal Diffusion Transformer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 111
FaithFusion: Harmonizing Reconstruction and Generation via Pixel-wise Information Gain
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 112
IR-HGP: Physically-Aware Gaussian Inverse Rendering for High-Illumination Scenes via Generative Priors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 113
Seeing through boxes: Non-Line-of-Sight 3D Reconstruction from Radar Signals
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 114
Speeding Up the Learning of 3D Gaussians with Much Shorter Gaussian Lists
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 115
DynamicTree: Interactive Real Tree Animation via Sparse Voxel Spectrum
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 116
WildRayZer: Self-supervised Large View Synthesis in Dynamic Environments
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 117
DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 118
Retrieve-to-Restore: Efficient All-in-One Image Restoration with a Retrieval-Based Degradation Bank
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 119
MRI Contrast Enhancement Kinetics World Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 120
ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 121
Rethinking Knowledge Transfer in Image Quality Assessment: A Perceptual Preference Structure Alignment Perspective
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 122
ZeroIDIR: Zero-Reference Illumination Degradation Image Restoration with Perturbed Consistency Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 123
White-Balance First, Adjust Later: Cross-Camera Color Constancy via Vision-Language Evaluation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 124
Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 125
LF-BVN: Blind-View Network for Self-Supervised Light Field Denoising
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 128
Towards Generalized Representations for Low-Light Understanding: When Signal Constancy Meets Semantic Enrichment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 129
Synergistic Bleeding Region and Point Detection in Laparoscopic Surgical Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 130
MedCLIPSeg: Probabilistic Vision-Language Adaptation for Data-Efficient and Generalizable Medical Image Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 131
AD-GBC: Anisotropic Granular-Ball Skip-Connection Refiner for UNet-Based Medical Image Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 132
OSA: Echocardiography Video Segmentation via Orthogonalized State Update and Anatomical Prior-aware Feature Enhancement
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 133
VesMamba: 3D Pulmonary Vessel Segmentation from CT images via Mamba with Structural Perception and Scale-aware Filtering
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 134
SemiGDA: Generative Dual-distribution Alignment for Semi-Supervised Medical Image Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 135
Diffusion-Based Native Adversarial Synthesis for Enhanced Medical Segmentation Generalization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 136
CG-Reasoner: Centroid-Guided Positional Reasoning Segmentation for Medical Imaging with a Robust Visual-Text Consistency Metric
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 137
Instruction-Guided Lesion Segmentation for Chest X-rays with Automatically Generated Large-Scale Dataset
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 138
Towards Highly Transferable Vision-Language Attack via Semantic-Augmented Dynamic Contrastive Interaction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 139
Towards Human-Imperceptible Backdoor Attacks on Text-to-Image Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 140
TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 141
DualMirage: Hunting Stealthy Multimodal LLM Agents via CAPTCHAs with Contour and Adversarial Illusions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 142
Models as Lego Builders: Assembling Malice from Benign Blocks via Semantic Blueprints
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 143
Source Models Leak What They Shouldn’t: Unlearning Zero-Shot Transfer in Domain Adaptation Through Adversarial Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 144
A Unified Perspective on Adversarial Membership Manipulation in Vision Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 145
Shedding Light on VLN Robustness: A Black-box Framework for Indoor Lighting-based Adversarial Attack
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 146
OddGridBench: Exposing the Lack of Fine-Grained Visual Discrepancy Sensitivity in Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 147
Beyond What's Shared: Recovering Lost Unique Information from Intermediate Layers to Boost Multimodal Geo-Foundation Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 148
WikiCLIP: An Efficient Contrastive Baseline for Open-domain Visual Entity Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 149
CLCR: Cross-Level Semantic Collaborative Representation for Multimodal Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 150
Learning Anchor in Dual Orthogonal Space for Fast Multi-view Clustering
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 153
FAVE: A Structured Benchmark for Fine-Grained Audio-Visual Temporal Evaluation in Multimodal LLMs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 154
Omni2Sound: Towards Unified Video-Text-to-Audio Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 155
EmoThinker: Advancing Visual-Acoustic Emotion Analysis via Structural Token Selection and Chain-of-Thought Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 156
Enhancing Descriptive Captions with Visual Attributes for Multimodal Perception
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 158
Vision-Speech Models: Teaching Speech Models to Converse about Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 159
EMMA: Extracting Multiple physical parameters from Multimodal Data
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 160
MMGait: Towards Multi-Modal Gait Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 161
OSMO: Open-vocabulary Self-eMOtion Tracking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 162
MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 164
Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 165
Active Perceptual Inference: A Corticothalamic-Inspired Dynamic Nested Recurrent Network for Multimodal Sentiment Analysis with Incomplete Data
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 166
Scalable Trajectory Generation for Whole-Body Mobile Manipulation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 167
Breaking the 3D Dataset Bottleneck: Fast Scalable Generation of Aligned 3D Assets from Scratch for Category 6D Pose Estimation and Robotic Grasping
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 168
Real-Time Multimodal Fingertip Contact Detection via Depth and Motion Fusion for Vision-Based Human–Computer Interaction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 169
Glove2Hand: Synthesizing Natural Hand-Object Interaction from Multi-Modal Sensing Gloves
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 170
UniDex: A Robot Foundation Suite for Universal Dexterous Hand Control from Egocentric Human Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 171
ConsID-Gen: View-Consistent and Identity-Preserving Image-to-Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 172
DiverseGRPO: Mitigating Mode Collapse in Image Generation via Diversity-Aware GRPO
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 173
VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 174
Video Generation with Stable Transparency via Shiftable RGB-A Distribution Learner
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 175
MOFA-VTON: More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 176
Scaling Multi-Identity Consistency for Image Customization via Multi-to-Multi Matching Paradigm
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 177
NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 178
Functional Mean Flow in Hilbert Space
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 179
Benchmarking Single-Factor Physical Video-to-Audio Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 180
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 181
Refaçade: Editing Object with Given Reference Texture
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 182
Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 183
Not All Birds Look The Same: Identity-Preserving Generation For Birds
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 184
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 185
EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 186
Clothe and Pose
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 187
FlowPortal: Residual-Corrected Flow for Training-Free Video Relighting and Background Replacement
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 188
The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 189
Rethinking UMM Visual Generation: Masked Modeling for Efficient Image-Only Pre-training
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 191
Bidirectional Normalizing Flow: From Data to Noise and Back
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 192
ShotDirector: Directorially Controllable Multi-Shot Video Generation with Cinematographic Transitions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 193
Are Image-to-Video Models Good Zero-Shot Image Editors?
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 194
FastLightGen: Fast and Light Video Generation with Fewer Steps and Parameters
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 195
Unified Latent Space for Understanding and Generation via Semantic Auto-encoder
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 196
AHS: Adaptive Head Synthesis via Synthetic Data Augmentations
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 197
CASR: A Robust Cyclic Framework for Arbitrary Large-Scale Super-Resolution with Distribution Alignment and Self-Similarity Awareness
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 198
Thermal Diffusion Matters: Infrared Spatial-Temporal Video Super-Resolution through Heat Conduction Priors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 199
TextOVSR: Text-Guided Real-World Opera Video Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 200
VoDaSuRe: A Large-Scale Dataset Revealing Domain Shift in Volumetric Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 201
GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 202
Adaptive Anisotropic Gaussian Splatting for Multi-contrast MRI Arbitrary-Scale Super-Resolution with Anatomy Guidance
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 203
SignPR: A Progressive Vector-Quantized Diffusion Framework for Sign Language Production
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 204
LLaMo: Scaling Pretrained Language Models for Unified Motion Understanding and Generation with Continuous Autoregressive Tokens
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 205
FlashCap: Millisecond-Accurate Human Motion Capture via Flashing LEDs and Event-Based Vision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 206
Geometric Neural Distance Fields for Learning Human Motion Priors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 207
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 208
Decoupled Generative Modeling for Human-Object Interaction Synthesis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 209
LiveGesture: Streamable Co-Speech Gesture Generation Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 210
HandX: Scaling Bimanual Motion and Interaction Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 211
MaskAdapt: Learning Flexible Motion Adaptation via Mask-Invariant Prior for Physics-Based Characters
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 212
FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 213
ProjFlow: Projection Sampling with Flow Matching for Zero‑Shot Exact Spatial Motion Control
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 214
Correspondence-Attention Alignment for Multi-View Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 215
GenErase: Generalizable and Semantically-Aware Concept Erasure in Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 216
MatMart: Material Reconstruction of 3D Objects via Diffusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 217
Region-Adaptive Sampling for Diffusion Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 218
Diffusion Guided Chain-of-Vision for Large Autoregressive Vision Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 219
Guiding Diffusion-based Reconstruction with Contrastive Signals for Balanced Visual Representation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 220
ConceptPrism: Concept Disentanglement in Personalized Diffusion Models via Residual Token Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 221
Heterogeneous Decentralized Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 222
Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 223
GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 224
ENC-Bench: A Benchmark for Evaluating Multimodal Large Language Models in Electronic Navigational Chart Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 226
RealBirdID: Benchmarking Bird Species Identification in the Era of MLLMs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 228
PP-OCRv5: A Specialized 5M-Parameter Model Rivaling Billion-Parameter Vision-Language Models on OCR Tasks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 229
World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 230
Gastric-X: A Multimodal Multi-Phase Benchmark Dataset for Advancing Vision-Language Models in Gastric Cancer Analysis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 231
HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 232
HandVQA: Diagnosing and Improving Fine-Grained Spatial Reasoning about Hands in Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 233
Probing and Bridging Geometry–Interaction Cues for Affordance Reasoning in Vision Foundation Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 234
ARC Is a Vision Problem!
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 235
Geoint-R1: Formalizing Multimodal Geometric Reasoning with Dynamic Auxiliary Constructions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 236
S^2-MLLM: Boosting Spatial Reasoning Capability of MLLMs for 3D Visual Grounding with Structural Guidance
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 237
Learning Multi-View Spatial Reasoning from Cross-View Relations
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 238
Exploring Spatial Intelligence from a Generative Perspective
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 239
Physical Object Understanding with a Physically Controllable World Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 240
QueryMe: Query-Driven Open-Vocabulary 3D Object Affordances Grounding from Multimodal Evidence
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 241
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 242
EG-3DVG: Expression and Geometry Aware Grounding Decoder for 3D Visual Grounding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 243
AffordMatcher: Affordance Learning in 3D Scenes from Visual Signifiers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 244
SpatiaLQA: A Benchmark for Evaluating Spatial Logical Reasoning in Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 245
Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 246
Intra-class Distribution-guided Generative Hashing with Neighbor Refinement for Cross-modal Retrieval
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 247
Language-driven Fine-grained Retrieval
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 248
MRD: Multi-resolution Retrieval-Detection Fusion for High-Resolution Image Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 249
RetFormer: Multimodal Retrieval for Enhancing Image Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 250
DREAM: Document Recognition with Explicit Adaptive Memory
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 251
RMIR: A Benchmark Dataset for Reasoning-Intensive Multimodal Image Retrieval
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 252
POGA: Paraphrased and Oppositional Graph Alignment for Fine-Grained Cross-Modal Retrieval
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 253
Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 254
TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 255
RiskProp: Collision-Anchored Self-Supervised Risk Propagation For Early Accident Anticipation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 256
MotionEnhancer: Leveraging Video Diffusion for Motion-Enhanced Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 257
MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 258
Asynchronous Temporal Modeling with Two-Agent Framework for Streaming Dense Video Captioning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 259
TRCoRSurg: Temporal-Relational Co-Reasoning for Surgical Video Triplet Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 260
OASIS: On-Demand Hierarchical Event Memory for Streaming Video Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 261
One-Shot Flow, Any-Time Frame: A Bidirectional Warping Framework for Event-Based Video Frame Interpolation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 262
TF-CADE: Foreground-Concentrated Text-Video Alignment for Zero-Shot Temporal Action Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 263
PRISM: Prototype-based Reasoning with Inter-modal Semantic Mining for Interpretable Image Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 264
Concept Regions Matter: Benchmarking CLIP with a New Cluster-Importance Approach
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 265
PhaseWin Search Framework Enable Efficient Object-Level Interpretation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 266
Beyond Top Activations: Efficient and Reliable Crowdsourced Evaluation of Automated Interpretability
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 267
From Weights to Concepts: Data-Free Interpretability of CLIP via Singular Vector Decomposition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 268
Hierarchical Concept Embedding & Pursuit for Interpretable Image Classification
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 269
Interpretable and Steerable Concept Bottleneck Sparse Autoencoders
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 270
C-LaV: Conditional Latent Velocity Field Denoising for Weather-Robust LiDAR Place Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 271
Towards Foundation Models for 3D Scene Understanding: Instance-Aware Self-Supervised Learning for Point Clouds
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 272
Generalized-CVO: Fast and Correspondence-Free Local Point Cloud Registration with Second Order Riemannian Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 273
LiDeRe: A Lightweight Readout for Fast and Data-Efficient Dense Prediction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 275
CoLC: Communication-Efficient Collaborative Perception with LiDAR Completion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 276
Spectral-Geometric Neural Fields for Pose-Free LiDAR View Synthesis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 277
C-GenReg: Training-Free 3D Point Cloud Registration by Multi-View-Consistent Geometry-to-Image Generation with Probabilistic Modalities Fusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 278
PatchAlign3D: Local Feature Alignment for Dense 3D Shape Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 279
FoV-Net: Rotation-Invariant CAD B-rep Learning via Field-of-View Ray Casting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 280
Neural Distribution Prior for LiDAR Out-of-Distribution Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 281
DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 282
Concept-Aware Batch Sampling Improves Language-Image Pretraining
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 283
HiFICL: High-Fidelity In-Context Learning for Multimodal Tasks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 284
InstAP: Instance-Aware Vision-Language Pre-Train for Spatial-Temporal Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 285
Vocabulary Scaling Law: Tuning Open-vocabulary Predictors for Their Openness
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 286
Render-to-Adapt: Unsupervised Personal Adaptation for Gaze Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 287
ViTPrompt: Training-Free Prompt Refinement with Visual Tokens for Open-Vocabulary Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 288
Cluster-Aware Neural Collapse Prompt Tuning for Long-Tailed Generalization of Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 289
LLMind: Bio-inspired Training-free Adaptive Visual Representations for Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 290
Dynamic Logits Adjustment and Exploration for Test-Time Adaptation in Vision Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 291
CAPT: Confusion-Aware Prompt Tuning for Reducing Vision-Language Misalignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 292
GenMatter: Perceiving Physical Objects with Generative Matter Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 293
Bidirectional Query-Driven Generation of Parametric CAD Sketch
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 294
The Missing GAP: From Solving Square Jigsaw Puzzles to Handling Real World Archaeological Fragments
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 295
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 296
OmniDocLayout: Towards Diverse Document Layout Generation via Coarse-to-Fine LLM Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 297
Yo'City: Personalized and Boundless 3D Realistic City Scene Generation via Self-Critic Expansion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 298
Repurposing 3D Generative Model for Autoregressive Layout Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 299
CAD-Refiner: A Unified Framework for CAD Generation and Iterative Editing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 300
A Debiased Reconstruction-based Framework for Training-Free Detection of AI-Generated Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 301
Global Information Thresholding for Sufficient and Necessary Circuits
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 302
PrivateEyes: Gaze-Preserving Anonymization for Data Sharing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 303
From Measurement to Mitigation: Quantifying and Reducing Identity Leakage in Image Representation Encoders with Linear Subspace Removal
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 304
Bias In, Bias Out? Finding Unbiased Subnetworks in Vanilla Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 305
pH-Strips for Selective Forgetting: A Blunt but Fast Diagnostic Baseline for Machine Unlearning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 306
Decoupling Defense Strategies for Robust Image Watermarking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 307
Unsafe2Safe: Controllable Image Anonymization for Downstream Utility
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 310
DP-FedAdamW: An Efficient Optimizer for Differentially Private Federated Large Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 311
Submodel Extraction for Efficient and Personalized Federated Learning via Optimal Transport
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 313
FedDAP: Domain-Aware Prototype Learning for Federated Learning under Domain Shift
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 314
FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 315
VIRST: Video-Instructed Reasoning Assistant for SpatioTemporal Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 317
Stay in your Lane: Role Specific Queries with Overlap Suppression Loss for Dense Video Captioning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 318
T2SGrid: Temporal-to-Spatial Gridification for Video Temporal Grounding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 319
HanDyVQA: A Video QA Benchmark for Fine-Grained Hand-Object Interaction Dynamics
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 320
SAIL: Similarity-Aware Guidance and Inter-Caption Augmentation-based Learning for Weakly-Supervised Dense Video Captioning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 321
Token Warping Helps MLLMs Look from Nearby Viewpoints
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 322
Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 323
Fine-Grained Post-Training Quantization for Large Vision Language Models with Quantization-Aware Integrated Gradients
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 324
Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 325
IF-Prune: Information-Flow Guided Token Pruning for Efficient Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 326
EvoComp: Learning Visual Token Compression for Multimodal Large Language Models via Semantic-Guided Evolutionary Labeling
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 327
DocPrune: Efficient Document Question Answering via Background, Question, and Comprehension-aware Token Pruning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 328
QuietPrune: Query-Guided Early Token Pruning for Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 329
The Devil Is in Gradient Entanglement: Energy-Aware Gradient Coordinator for Robust Generalized Category Discovery
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 330
LLM-Guided Probabilistic Fusion for Label-Efficient Document Layout Analysis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 331
Coordinate Denoising for Non‑Equilibrium Molecular Representation Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 332
Plug-and-Play Incomplete Multi-View Clustering via Janus-Faced Affinity Learning with Topology Harmonization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 333
Meta-Learning In-Context Enables Training-Free Cross Subject Brain Decoding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 334
Measure The Feature Universe: Topology-based Pseudo Labeling and Gravity Consistency for Source-Free Domain Adaptation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 335
Conditional Factuality Controlled LLMs with Generalization Certificates via Conformal Sampling
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 336
Harnessing the Power of Foundation Models for Accurate Material Classification
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 337
Content-Aware Frequency Encoding for Implicit Neural Representations with Fourier-Chebyshev Features
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 338
ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 339
TeFlow: Enabling Multi-frame Supervision for Self-Supervised Feed-forward Scene Flow Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 340
Think Before You Drive: World Model-Inspired Multimodal Grounding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 342
DrivePTS: A Progressive Learning Framework with Textual and Structural Enhancement for Driving Scene Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 343
WOD-E2E: Waymo Open Dataset for End-to-End Driving in Challenging Long-tail Scenarios
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 344
GuideFlow: Constraint-Guided Flow Matching for Planning in End-to-End Autonomous Driving
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 345
ResAD: Normalized Residual Trajectory Modeling for End-to-End Autonomous Driving
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 346
KnowVal: A Knowledge-Augmented and Value-Guided Autonomous Driving System
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 347
FoSS: Modeling Long-Range Dependencies and Multimodal Uncertainty in Trajectory Prediction via Fourier–State Space Integration
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 348
NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 349
Visual Prototype Conditioned Focal Region Generation for UAV-Based Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 350
Consistent Instance Field for Dynamic Scene Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 351
CLP: A Real-World Dataset of Contaminated Lens Protectors for Robust Semantic Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 353
Heuristic Self-Paced Learning for Domain Adaptive Semantic Segmentation under Adverse Conditions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 354
SAM2Text: Towards Prompt-Free and Multi-Resolution Video Scene Text Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 355
Reinforcing Video Reasoning Segmentation to Think Before It Segments
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 356
VideoMaMa: Mask-Guided Video Matting via Generative Prior
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 357
Quantized Residuals to Continuous Prompts for Few-Shot Class Incremental Learning in Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 358
The Golden Subspace: Where Efficiency Meets Generalization in Continual Test-Time Adaptation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 359
SAIDO: Generalizable Detection of AI-Generated Images via Scene-Aware and Importance-Guided Dynamic Optimization in Continual Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 360
Is Parameter Isolation Better for Prompt-Based Continual Learning?
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 361
Octopus: History-Free Gradient Orthogonalization for Continual Learning in Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 362
Affordance-First Decomposition for Continual Learning in Video–Language Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 363
Quantum-Gated Task-interaction Knowledge Distillation for Pre-trained Model-based Class-Incremental Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 364
Elastic Weight Consolidation Done Right for Continual Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 365
On Token's Dilemma: Dynamic MoE with Drift-Aware Token Assignment for Continual Learning of Large Vision Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 366
Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 367
Talking Together: Synthesizing Co-Located 3D Conversations from Audio
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 368
InfinityHuman: Towards Long-Term Audio-Driven Human Animation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 369
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 370
AudioAvatar: Personalized Audio-driven Whole-body Talking Avatars
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 371
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 372
Counterfactual VLA: Self-Reflective Vision-Language-Action Model with Adaptive Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 373
SGDrive: Scene-to-Goal Hierarchical World Cognition for Autonomous Driving
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 374
CapNav: Benchmarking Vision Language Models on Capability-conditioned Indoor Navigation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 375
AutoTraces: Autoregressive Trajectory Forecasting via Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 376
AwareVLN: Reasoning with Self-awareness for Vision-Language Navigation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 377
Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 378
Tavatar: Topology-Aware Gaussian Attribute Derivation for Animatable Human Avatars
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 379
PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 380
PhysHead: Simulation-Ready Gaussian Head Avatars
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 381
ReWeaver: Towards Simulation-Ready and Topology-Accurate Garment Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 382
FHAvatar: Fast and High-Fidelity Reconstruction of Face-and-Hair Composable 3D Head Avatar from Few Casual Captures
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 383
Feed-Forward One-Shot Animatable Textured Mesh Avatar Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 384
Reallocating Attention Across Layers to Reduce Multimodal Hallucination
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 385
VES-RFT: Rewarding Visual Evidence Sensitivity to Mitigate Hallucinations in Large Vision–Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 386
Fighting Hallucinations with Counterfactuals: Diffusion-Guided Perturbations for LVLM Hallucination Suppression
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 387
Unstitching the Chimera: Frame-Level Risk and Train-Free Mitigation for Video Hallucination
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 389
Breaking the Illusion: When Positive Meets Negative in Multimodal Decoding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 390
FlexTraj: Image-to-Video Generation with Flexible Point Trajectory Control
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 391
Diff4Splat: Repurposing Video Diffusion Models for Dynamic Scene Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 392
Spatia: Video Generation with Updatable Spatial Memory
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 393
Geometry-as-context: Modulating Explicit 3D in Scene-consistent Video Generation to Geometry Context
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 394
EgoControl: Controllable Egocentric Video Generation via 3D Full-Body Poses
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 395
CustomTex: High-fidelity Indoor Scene Texturing via Multi-Reference Customization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 396
FoleyDesigner: Immersive Stereo Foley Generation with Precise Spatio-Temporal Alignment for Film Clips
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 397
Physical Simulator In-the-Loop Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 398
Refracting Reality: Generating Images with Realistic Transparent Objects
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 399
Generating Humanless Environment Walkthroughs from Egocentric Walking Tour Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 400
EgoFlow: Gradient-Guided Flow Matching for Egocentric 6DoF Object Motion Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 401
Spatial-Frequency Collaborative Learning for Occluded Visible-Infrared Person Re-Identification
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 402
Mind the Gap: Transferring Labels to Align Object Detection Datasets
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 403
SSM-Aware Token-Efficient VMamba via Adaptive Patch Pruning and Merging for Person Re-Identification
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 404
Tri-Modal Fusion Transformers for UAV-based Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 405
View-Aware Semantic Alignment for Aerial-Ground Person Re-Identification
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 406
RHCNet: Residual-Guided Hierarchical Calibration Network for Robust Underwater Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 407
X-AVDT: Audio-Visual Cross-Attention for Robust Deepfake Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 408
Beyond Duality: A Hybrid Framework of Leveraging Shared and Private Features for RGB-Event Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 409
FVBench: Benchmarking Deepfake Video Detection Capability of Large Multimodal Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 410
AKCMamba-YOLO: Selective State Space Models For Real-Time Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 411
When AVSR Meets Video Conferencing: Dataset, Degradation, and the Hidden Mechanism Behind Performance Collapse
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 412
Your One-Stop Solution for AI-Generated Video Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 413
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 414
Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 415
HumanVBench: Probing Human-Centric Video Understanding in MLLMs with Automatically Synthesized Benchmarks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 416
HERBench: A Benchmark for Multi-Evidence Integration in Video Question Answering
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 417
Seeing the Scene Matters: Revealing Forgetting in Video Understanding Models with a Scene-Aware Long-Video Benchmark
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 418
Thinking with Frames: Generative Video Distortion Evaluation via Frame Reward Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 419
MovieRecapsQA: A Multimodal Open-Ended Video Question-Answering Benchmark
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 420
Training-free, Perceptually Consistent Low-Resolution Previews with High-Resolution Image for Efficient Workflows of Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 421
One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 422
Reflection Separation from a Single Image via Joint Latent Diffusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 423
MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 424
DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 426
VMonarch: Efficient Video Diffusion Transformers with Structured Attention
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 427
DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 428
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 429
Transition Matching Distillation for Fast Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 430
Diffusion-Based Makeup Transfer with Facial Region-Aware Makeup Features
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 431
UniPR: Unified Object-level Real-to-Sim Perception and Reconstruction from a Single Stereo Pair
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 432
Query2Uncertainty: Robust Uncertainty Quantification and Calibration for 3D Object Detection under Distribution Shift
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 433
DICArt: Advancing Category-level Articulated Object Pose Estimation in Discrete State-Spaces
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 434
PoseGaussian: 6D Pose Estimation for Unseen Objects via Sparse-View Object-Level 3D Gaussian Splatting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 435
VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 436
MonoSAOD: Monocular 3D Object Detection with Sparsely Annotated Label
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 437
V2U4Real: A Real-world Large-scale Dataset for Vehicle-to-UAV Cooperative Perception
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 438
SketchVL: Policy Optimization via Fine-Grained Credit Assignment for Chart Understanding and More
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 439
A Causal Marriage between VLM and IRM from Understanding to Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 440
Why Does RL Generalize Better Than SFT? A Data-Centric Perspective on VLM Post-Training
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 441
SoC: Semantic Orthogonal Calibration for Test-Time Prompt Tuning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 442
Learning to Select Visual Tools from Experience
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 443
Agile Deliberation: Concept Deliberation for Subjective Visual Classification
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 444
Tea-Adapter: Teacher Adapter for Efficient Conditional Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 445
From Failure to Feedback: Group Revision Unlocks Hard Cases in Object-Level Grounding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 446
Perception Characteristics Distance: Measuring Stability and Robustness of Perception System in Dynamic Conditions under a Certain Decision Rule
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 447
FinPercep-RM: A Fine-grained Reward Model and Co-evolutionary Curriculum for RL-based Real-world Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 448
Twin-T & TwintVQA: A Reliable Structure–Detail Separating VLM and a Comprehensive Benchmark for Chart and Table Tasks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 449
SDGS: Spatial Difference Guided Gaussian Splatting for Simultaneous Localization and 3D Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 450
RT-Splatting: Joint Reflection-Transmission Modeling with Gaussian Splatting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 451
Pose-Free Omnidirectional Gaussian Splatting for 360-Degree Videos with Consistent Depth Priors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 452
Distilling Unsigned Distance Function for Surface Reconstruction from 3D Gaussian Splatting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 453
Exact-GS: Mathematically Rigorous and Accurate 3D Gaussian Splatting for 3D X-ray Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 454
DualSplat: Robust 3D Gaussian Splatting via Pseudo-Mask Bootstrapping from Reconstruction Failures
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 455
E2EGS: Event-to-Edge Gaussian Splatting for Pose-Free 3D Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 456
Neural Gabor Splatting: Enhanced Gaussian Splatting with Neural Gabor for High-frequency Surface Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 457
DirectFisheye-GS: Enabling Native Fisheye Input in Gaussian Splatting with Cross-View Joint Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 458
VAD-GS: Visibility-Aware Densification for 3D Gaussian Splatting in Dynamic Urban Scenes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 459
GauMVC: Generative Decoupled Gaussian Representation for Human-centric Multi-view Video Compression
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 460
A Geometric Algebra-Informed 3DGS Framework for Wireless Channel Prediction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 461
RaGS: Unleashing 3D Gaussian Splatting from 4D Radar and Monocular Cue for 3D Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 462
Cross-Instance Gaussian Splatting Registration via Geometry-Aware Feature-Guided Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 463
ActivePolicy: Active Gaussian Reconstruction and Optimization Strategy Based on Global-Local Information Gain
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 464
Uncertainty-driven 3D Gaussian Splatting Active Mapping via Anisotropic Visibility Field
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 465
SV-GS: Sparse View 4D Reconstruction with Skeleton-Driven Gaussian Splatting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 466
NimbusGS: Unified 3D Scene Reconstruction under Hybrid Weather
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 467
SparseSplat: Towards Applicable Feed-Forward 3D Gaussian Splatting with Pixel-Unaligned Prediction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 468
REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 469
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 470
Unlocking Token Rewards via Training-Free Reward Attribution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 472
When to Think and When to Look: Uncertainty-Guided Lookback
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 473
StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 474
Understanding Counting Mechanisms in Large Language and Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 475
CLiViS: Unleashing Cognitive Map through Linguistic-Visual Synergy for Embodied Visual Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 476
Proof-of-Perception: Certified Tool-Using Multimodal Reasoning with Compositional Conformal Guarantees
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 479
Hugging Visual Prompt and Segmentation Tokens: Consistency Learning for Fine-Grained Visual Understanding in MLLMs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 480
VisionLeaf: Entropy-Guided Leaf-First Reasoning for Efficient and Accurate Think-with-Image
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 481
GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 482
Beyond Depth: Evaluating the Width-centric Reasoning Capability of MLLMs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 483
GenSplat: Bridging the Generalization Gap in 3DGS Language Comprehension
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 484
CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 485
LoPrune: Efficient Data Pruning for LoRA-Based Fine-Tuning of Vision Transformer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 488
RADAR: VQ-VAE Decoder of VAR is a Good Student for Restoring Against Degradation by Acceleration
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 489
Beyond Single Solution: Multi-Hypothesis Deep Unfolding Network for Image Compressive Sensing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 490
FlashDecoder: Real-Time Latent-to-Pixel Streaming Decoder with Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 491
MambaSIC: Mamba-based Stereo Image Compression with Bi-directional Multi-reference Entropy Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 492
Neural Dynamic GI: Random-Access Neural Compression for Temporal Lightmaps in Dynamic Lighting Environments
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 493
Discovering Adaptive Task Dependencies for Efficient Multi-Task Representation Compression
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 494
OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 495
Perceptual Neural Video Compression with Color Separation and Rank Chain
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 496
Beyond Matching to Tiles: Bridging Unaligned Aerial and Satellite Views for Vision-Only UAV Navigation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 498
PiLoT: Neural Pixel-to-3D Registration for UAV-based Ego and Target Geo-localization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 499
PAUL: Uncertainty-Guided Partition and Augmentation for Robust Cross-View Geo-Localization under Noisy Correspondence
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 500
UniGeoRS: A Unified Benchmark for Tri-view Geo-Localization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 502
Watch and Learn: Learning to Use Computers from Online Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 503
OneThinker: All-in-one Reasoning Model for Image and Video
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 504
Incentivizing Versatile Video Reasoning in MLLMs via Data-Efficient Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 505
Act2See: Emergent Active Visual Perception for Video Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 506
VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 507
ViLoMem: Agentic Learner with Grow-and-Refine Multimodal Semantic Memory
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 508
ReMoT: Reinforcement Learning with Motion Contrast Triplets
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 509
Incentivizing Generative Zero-Shot Learning via Outcome-Reward Reinforcement Learning with Visual Cues
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 510
Semantic-Guided Global-Local Collaborative Prompt Learning for Few-Shot Class Incremental Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 511
Beyond Heuristic Prompting: A Concept-Guided Bayesian Framework for Zero-Shot Image Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 512
One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 513
Data-Centric Meta-Learning for Robust Few-Shot Generalization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 514
Bridging the Modality Gap in Compositional Zero-Shot Learning via Sparse Alignment and Unimodal Memory Bank
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 515
LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 517
Uncertainty-Aware Knowledge Distillation for Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 518
Beyond Soft Label: Dataset Distillation via Orthogonal Gradient Matching
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 519
BHCast: Unlocking Black Hole Plasma Dynamics from a Single Blurry Image with Long-Term Forecasting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 520
RawMetaDiff: Unlocking Extreme Darkness from Dual-Exposure RAW with Meta-Guided Diffusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 521
Prospective Dynamic 3D MRI Reconstruction via Latent-Space Motion Tracking from Single Measurement
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 522
Lens Component Deletion based on Differentiable Ray Tracing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 523
X-band Radar Non-Line-of-Sight Imaging
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 524
3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 525
UAVLight: A Benchmark for Illumination-Robust 3D Reconstruction in Unmanned Aerial Vehicle (UAV) Scenes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 526
Polarization State Tracing for Reflection Removal and Color-Consistent Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 527
GFRRN: Explore the Gaps in Single Image Reflection Removal
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 528
Efficient All-Pairs Correlation Volume Sampling for Optical Flow Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 529
Cross-Slice Knowledge Transfer via Masked Multi-Modal Heterogeneous Graph Contrastive Learning for Spatial Gene Expression Inference
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 530
Adapting a Pre-trained Single-Cell Foundation Model to Spatial Gene Expression Generation from Histology Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 532
SO(3)-Equivariant ViT-Adapter for Data-Efficient Zero-Shot Sim-to-Real Indoor Panoramic Depth Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 533
Sparsity-Aware Voxel Attention and Foreground Modulation for 3D Semantic Scene Completion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 534
XPaintNet: An eXtreme Lightweight Framework for Stereoscopic Conversion without Inpainting Network
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 535
MD2E: Modeling Depth-to-Edge Cues for Monocular Metric Depth Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 536
LiteSense: Lifting Lightweight ToF with RGB for High-Resolution Metric Depth Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 537
3D-Aware Multi-Task Learning with Cross-View Correlations for Dense Scene Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 538
The Midas Touch for Metric Depth
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 539
Lifting Unlabeled Internet-level Data for 3D Scene Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 540
ObjectMorpher: 3D-Aware Image Editing via Deformable 3DGS
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 542
MeshFlow: Efficient Artistic Mesh Generation via MeshVAE and Flow-based Diffusion Transformer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 543
WonderZoom: Multi-Scale 3D World Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 544
SceneTok: A Compressed, Diffusable Token Space for 3D Scenes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 545
PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 546
Extend3D: Town-Scale 3D Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 547
Pano3DComposer: Feed-Forward Compositional 3D Scene Generation from Single Panoramic Image
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 548
MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 549
CaliTex: Geometry-Calibrated Attention for View-Coherent 3D Texture Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 550
CraftMesh: High-Fidelity Generative Mesh Manipulation via Poisson Seamless Fusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 551
LoG3D: Ultra-High-Resolution 3D Shape Modeling via Local-to-Global Partitioning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 552
MaskFocus: Focusing Policy Optimization on Critical Steps for Masked Image Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 553
Efficient Training for Human Video Generation with Entropy-Guided Prioritized Progressive Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 554
PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 555
GRPO-Guard: Mitigating Implicit Over-Optimization in Flow Matching via Regulated Clipping
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 556
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 557
Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 558
VISTA: A Test-Time Self-Improving Video Generation Agent
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 559
Neighbor GRPO: Contrastive ODE Policy Optimization Aligns Flow Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 560
SMV-EAR: Bring Spatiotemporal Multi-View Representation Learning into Efficient Event-Based Action Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 561
Hierarchical Action Learning for Weakly-Supervised Action Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 562
Gamba: Mamba-based graph convolutional network with dynamic graph topology learning for action recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 563
Beyond Binary Contrast: Modeling Continuous Skeleton Action Spaces with Transitional Anchors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 564
PRISM: Learning a Shared Primitive Space for Transferable Skeleton Action Representation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 565
TWEO: Transformers Without Extreme Outliers Enables FP8 Training And Quantization For Dummies
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 566
Unified Spherical Frontend: Learning Rotation-Equivariant Representations of Spherical Images from Any Camera
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 569
Towards Knowledge-augmented Bayesian Deep Learning For Computer Vision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 570
NESTOR: A Nested MOE-based Neural Operator for Large-Scale PDE Pre-Training
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 571
Evidential Transformation Network: Turning Pretrained Models into Evidential Models for Post-hoc Uncertainty Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 572
Beyond Euclidean Gossip: KL-Barycentric Consensus on Heterogeneous and Imbalanced Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 574
Batch Loss Score for Dynamic Data Pruning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 575
Teacher-Guided Routing for Sparse Vision Mixture-of-Experts
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 576
WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 577
MangoBench: A Benchmark for Multi-Agent Goal-Conditioned Offline Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 578
iSHIFT: Lightweight Slow-Fast GUI Agent with Adaptive Perception
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 579
MMBench-GUI: A Unified Hierarchical Evaluation Framework for Multi-Platform GUI Agents
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 580
Boosting Vision-Language Models Towards Cross-Domain Incremental Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 581
UniSpector: Towards Universal Open-set Defect Recognition via Spectral-Contrastive Visual Prompting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 582
Unlearning without Forgetting: Securely Removing Targeted Concepts from Large-Scale Vision-Language Open-Vocabulary Detectors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 583
UNI-OOD: Unified Object- and Image-level Out-of-Distribution Detection via Cross-Context Attentive Vision-Language Modeling
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 584
S2C2Seg: Semantic-Spatial Consistency and Category Optimization for Open-Vocabulary Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 585
NoOVD: Novel Category Discovery and Embedding for Open-Vocabulary Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 586
The Missing Point in Vision Transformers for Universal Image Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 587
PromptMoE: A Segmentation Refinement Framework Leveraging Mixture of Experts for Improved Prompting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 588
The Power of Prior: Training-Free Open-Vocabulary Semantic Segmentation with LLaVA
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 589
Beyond Text: Visual Description Assembly by Probabilistic Model for CLIP-based Weakly Supervised Semantic Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 590
High-Precision Dichotomous Image Segmentation via Depth Integrity-Prior and Fine-Grained Patch Strategy
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 591
GeoSAM2: Unleashing the Power of SAM2 for 3D Part Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 592
Material Magic Wand: Material-Aware Grouping of 3D Parts in Untextured Meshes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 593
Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 594
Unlocking 3D Affordance Segmentation with 2D Semantic Knowledge
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 595
HySeg: Learning Generative Priors for Structure-Aware Remote Sensing Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 597
MMVIP: A Visible-infrared Paired Dataset for Multi-weather Marine Vision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 598
Beyond Tie Points: Satellite Image Block Adjustment based on Dense Feature Consistency
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 599
Spectrally Distilled Representations Aligned with Instruction-Augmented LLMs for Satellite Imagery
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 600
Global Underwater Geolocation from Time-Lapse Polarization Imagery
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 601
Olbedo: An Albedo and Shading Aerial Dataset for Large-Scale Outdoor Environments
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 602
PRUE: A Practical Recipe for Field Boundary Segmentation at Scale
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 603
SARMAE: Masked Autoencoder for SAR Representation Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 604
LNEM: Lunar Neural Elevation Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 605
A Polarized Reflection and Material Dataset of Real World Objects
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 606
LaSM: Layer-wise Scaling Mechanism for Defending Pop-up Attack on GUI Agents
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 607
RaPA: Enhancing Transferable Targeted Attacks via Random Parameter Pruning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 608
All Vehicles Can Lie: Efficient Adversarial Defense in Fully Untrusted-Vehicle Collaborative Perception via Pseudo-Random Bayesian Inference
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 609
A Combination of Noise and Bilateral Filters Achieve Supralinear and Scalable Adversarial Robustness in CNNs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 611
Write Where It Matters: Policy-Guided Watermarks for 3D Gaussian Splatting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 612
Attack for Defense: Adversarial Agents for Point Prompt Optimization Empowering Segment Anything Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 613
RevINN: An End-to-End Invertible Neural Network for Reversible Adversarial Examples Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 614
CamPI: Physical Adversarial Examples through Camera Power Signal Injection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 616
GraspALL: Adaptive Structural Compensation from Illumination Variation for Robotic Garment Grasping in Any Low-Light Conditions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 617
Opening the Sim-to-Real Door for Humanoid Pixel-to-Action Policy Transfer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 618
Learning Cross-View Object Correspondence via Cycle-Consistent Mask Prediction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 619
RoboWheel: A Data Engine from Real-World Human Demonstrations for Cross-Embodiment Robotic Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 620
Chain of World: World Model Thinking in Latent Motion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 621
Scalable Feature Matching via State Space Modeling and Sparse Correlation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 622
Video2Robo: 3DGS-based Synthetic Data from One Video Enables Scalable Robot Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 623
ConsisVLA-4D: Advancing Spatiotemporal Consistency in Efficient 3D-Perception and 4D-Reasoning for Robotic Manipulation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 624
SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 625
GeoDexGrasp: Geometry-aware Generation for Data-efficient and Physics-plausible Dexterous Grasping
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 626
Lifelong Imitation Learning with Multimodal Latent Replay and Incremental Adjustment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 627
From Observation to Action: Latent Action-based Primitive Segmentation for VLA Pre-training in Industrial Settings
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 628
AGiLe: Learning Robust Long-Horizon Manipulation via Affordance-Grounded Bidirectional Latent Planning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 629
Language-Grounded Decoupled Action Representation for Robotic Manipulation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 630
Learning to Act Robustly with View-Invariant Latent Actions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 631
ORBIT: Benchmarking SfM in the Wild with 360° Video
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 632
SpikeTrack: A Spike-driven Framework for Efficient Visual Tracking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 633
Time Without Time: Pseudo-Temporal Representation for Space-Time Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 634
Envisioning the Future, One Step at a Time
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 635
FlowFM: Advancing Dark Optical Flow Estimation with Flow Matching
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 636
Drift-Resilient Temporal Priors for Visual Tracking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 637
An Efficient Token Compression Framework for Visual Object Tracking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 638
No Labels, No Look-Ahead: Unsupervised Online Video Stabilization with Classical Priors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 639
From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 640
Momentum Memory for Knowledge Distillation in Computational Pathology
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 641
Modeling the Brain’s Grammar: ROI-Guided fMRI Pretraining for Transferable and Interpretable Vision Decoding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 642
Joint Spectral Image Reconstruction and Semantic Segmentation with Cooperative Unfolding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 643
X-WIN: Building Chest Radiograph World Model via Predictive Sensing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 644
fMRI-LM: Towards a Universal Foundation Model for Language-Aligned fMRI Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 645
Tell2Adapt: A Unified Framework for Source Free Unsupervised Domain Adaptation via Vision Foundation Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 646
TIM: Temporal Decoupling with Iterative Mutual-Refinement Model for Longitudinal Radiology Report Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 647
Ultrasound-CLIP: Semantic-Aware Contrastive Pre-training for Ultrasound Image-Text Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 649
BiGMINT: Biologically-guided Hierarchical Multimodal Integration for Modeling Multiple Compound Activities in Drug Discovery
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 650
Modeling Spatiotemporal Neural Frames for High Resolution Brain Dynamic
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 651
CMR-RD: Long-Tailed Adaptive VLM for Explainable CMR Diagnosis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 652
Clinically-Grounded Counterfactual Reasoning for Medical Video Diagnosis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 653
FBTA: Enabling Single-GPU End-to-End Gigapixel WSI Classification with Feature Bridging and Translation Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 654
Ultra Diffusion Poser: Diffusion-Based Human Motion Tracking from Sparse Inertial Sensors and Ranging-based Between-sensor Distances
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 655
Egocentric Visibility-Aware Human Pose Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 656
Shoe Style-Invariant and Ground-Aware Learning for Dense Foot Contact Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 657
OMG-Bench: A New Challenging Benchmark for Skeleton-based Online Micro Hand Gesture Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 658
Recovering Physically Plausible Human-Object Interactions from Monocular Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 659
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 660
TeHOR: Text-Guided 3D Human and Object Reconstruction with Textures
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 661
SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 662
CrossHOI: Learning Cross-View Representations for Monocular 3D Human-Object Interaction Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 663
Gaussian-Mixture Latent Flow for Stochastic 3D Human Motion Prediction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 664
SGSoft: Learning Fused Semantic-Geometric Features for 3D Shape Correspondence via Template-Guided Soft Signals
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 665
Beyond Single-View Sufficiency: CVBench for Cross-View Human Understanding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 666
Breaking Spurious Correlations: Uncertainty-Driven Causal Transformers for AU Detection
Poster Session
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F
Poster Session 1 & Exhibit Hall
Art Program
Fri Jun 05 09:45 AM -- 05:00 PM (PDT) @ ExHall F
Art Exhibition
Art Program
Fri Jun 05 10:00 AM -- 10:30 AM (PDT) @ ExHall F
Art Gallery Tour with Curator and Artists
Oral
Fri Jun 05 12:00 PM -- 12:12 PM (PDT) @ Mile High Ballroom 3A - 4A
4D Primitive-Mâché: Glueing Primitives for Persistent 4D Scene Reconstruction
Oral
Fri Jun 05 12:00 PM -- 12:12 PM (PDT) @ Four Seasons Ballroom
3DReflecNet: A Large-Scale Dataset for 3D Reconstruction of Reflective, Transparent, and Low-Texture Objects
Oral
Fri Jun 05 12:00 PM -- 12:12 PM (PDT) @ Bluebird Ballroom
MAMMA: Markerless Accurate Multi-person Motion Acquisition
Oral
Fri Jun 05 12:00 PM -- 12:12 PM (PDT) @ Mile High Ballroom 1A - 2A
Energy-GS: Image Energy-guided Pose Alignment Gaussian Splatting with redesigned pose gradient flow
Oral Session
Fri Jun 05 12:00 PM -- 01:15 PM (PDT) @ Bluebird Ballroom
Oral Session 2A: 3D Reconstruction
Oral Session
Fri Jun 05 12:00 PM -- 01:15 PM (PDT) @ Mile High Ballroom 3A - 4A
Oral Session 2D: Spatio-Temporal Reconstruction
Oral Session
Fri Jun 05 12:00 PM -- 01:15 PM (PDT) @ Mile High Ballroom 1A - 2A
Oral Session 2C: Gaussian Splatting & Reconstruction
Oral Session
Fri Jun 05 12:00 PM -- 01:15 PM (PDT) @ Four Seasons Ballroom
Oral Session 2B: Materials & Lighting
Oral
Fri Jun 05 12:12 PM -- 12:25 PM (PDT) @ Mile High Ballroom 1A - 2A
MeshSplatting: Differentiable Rendering with Opaque Meshes
Oral
Fri Jun 05 12:12 PM -- 12:25 PM (PDT) @ Bluebird Ballroom
Natural Human Motion Recovery by Aligning High-Order Temporal Dynamics from Monocular Videos
Oral
Fri Jun 05 12:12 PM -- 12:25 PM (PDT) @ Mile High Ballroom 3A - 4A
Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
Oral
Fri Jun 05 12:12 PM -- 12:25 PM (PDT) @ Four Seasons Ballroom
GLINT: Modeling Scene-Scale Transparency via Gaussian Radiance Transport
Oral
Fri Jun 05 12:25 PM -- 12:37 PM (PDT) @ Four Seasons Ballroom
Neural Field-Based 3D Surface Reconstruction of Microstructures from Multi-Detector Signals in Scanning Electron Microscopy
Oral
Fri Jun 05 12:25 PM -- 12:37 PM (PDT) @ Mile High Ballroom 3A - 4A
FUSER: Feed-Forward Multiview 3D Registration Transformer and SE(3)^N Diffusion Refinement
Oral
Fri Jun 05 12:25 PM -- 12:37 PM (PDT) @ Bluebird Ballroom
PoseGAM: Robust Unseen Object Pose Estimation via Geometry-Aware Multi-View Reasoning
Oral
Fri Jun 05 12:25 PM -- 12:37 PM (PDT) @ Mile High Ballroom 1A - 2A
Proxy-GS: Unified Occlusion Priors for Training and Inference in Structured 3D Gaussian Splatting
Art Program
Fri Jun 05 12:30 PM -- 01:30 PM (PDT) @ Room 201
Art Panel
Oral
Fri Jun 05 12:37 PM -- 12:50 PM (PDT) @ Bluebird Ballroom
SAM 3D Body: Robust Full-Body Human Mesh Recovery
Oral
Fri Jun 05 12:37 PM -- 12:50 PM (PDT) @ Four Seasons Ballroom
PhyGaP: Physically-Grounded Gaussians with Polarization Cues
Oral
Fri Jun 05 12:37 PM -- 12:50 PM (PDT) @ Mile High Ballroom 3A - 4A
Residual Primitive Fitting of 3D Shapes with SuperFrusta
Oral
Fri Jun 05 12:37 PM -- 12:50 PM (PDT) @ Mile High Ballroom 1A - 2A
RetimeGS: Continuous-Time Reconstruction of 4D Gaussian Splatting
Oral
Fri Jun 05 12:50 PM -- 01:02 PM (PDT) @ Bluebird Ballroom
SAM 3D: 3Dfy Anything in Images
Oral
Fri Jun 05 12:50 PM -- 01:02 PM (PDT) @ Four Seasons Ballroom
PPISP: Physically-Plausible Compensation and Control of Photometric Variations in Radiance Field Reconstruction
Oral
Fri Jun 05 12:50 PM -- 01:02 PM (PDT) @ Mile High Ballroom 1A - 2A
Selfi: Self-improving Reconstruction Engine via 3D Geometric Feature Alignment
Oral
Fri Jun 05 12:50 PM -- 01:02 PM (PDT) @ Mile High Ballroom 3A - 4A
SmokeSVD: Smoke Reconstruction from A Single View via Progressive Novel View Synthesis and Refinement with Diffusion Models
Oral
Fri Jun 05 01:02 PM -- 01:15 PM (PDT) @ Mile High Ballroom 3A - 4A
SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model
Oral
Fri Jun 05 01:02 PM -- 01:15 PM (PDT) @ Mile High Ballroom 1A - 2A
Z-Order Transformer for Feed-Forward Gaussian Splatting
Oral
Fri Jun 05 01:02 PM -- 01:15 PM (PDT) @ Bluebird Ballroom
SPARK: Sim-ready Part-level Articulated Reconstruction with VLM Knowledge
Oral
Fri Jun 05 01:02 PM -- 01:15 PM (PDT) @ Four Seasons Ballroom
SeeGroup: Multi-Layer Depth Estimation of Transparent Surfaces via Self-Determined Grouping
Break
Fri Jun 05 01:15 PM -- 01:30 PM (PDT)
Courtesy Break
Keynote
Fri Jun 05 01:45 PM -- 02:45 PM (PDT) @ Bluebird Ballroom
Programmable Biology: Generative AI for Molecular Design
Poster Setup
Fri Jun 05 02:30 PM -- 03:00 PM (PDT) @ ExHall A
Poster Setup
Demonstration
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall F
Demos
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 1
MAMMA: Markerless Accurate Multi-person Motion Acquisition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 2
Natural Human Motion Recovery by Aligning High-Order Temporal Dynamics from Monocular Videos
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 3
PoseGAM: Robust Unseen Object Pose Estimation via Geometry-Aware Multi-View Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 4
SAM 3D Body: Robust Full-Body Human Mesh Recovery
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 5
SAM 3D: 3Dfy Anything in Images
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 6
SPARK: Sim-ready Part-level Articulated Reconstruction with VLM Knowledge
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 7
3DReflecNet: A Large-Scale Dataset for 3D Reconstruction of Reflective, Transparent, and Low-Texture Objects
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 8
GLINT: Modeling Scene-Scale Transparency via Gaussian Radiance Transport
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 9
Neural Field-Based 3D Surface Reconstruction of Microstructures from Multi-Detector Signals in Scanning Electron Microscopy
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 10
PhyGaP: Physically-Grounded Gaussians with Polarization Cues
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 11
PPISP: Physically-Plausible Compensation and Control of Photometric Variations in Radiance Field Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 12
SeeGroup: Multi-Layer Depth Estimation of Transparent Surfaces via Self-Determined Grouping
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 13
Energy-GS: Image Energy-guided Pose Alignment Gaussian Splatting with redesigned pose gradient flow
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 14
MeshSplatting: Differentiable Rendering with Opaque Meshes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 15
Proxy-GS: Unified Occlusion Priors for Training and Inference in Structured 3D Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 16
RetimeGS: Continuous-Time Reconstruction of 4D Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 17
Selfi: Self-improving Reconstruction Engine via 3D Geometric Feature Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 18
Z-Order Transformer for Feed-Forward Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 19
4D Primitive-Mâché: Glueing Primitives for Persistent 4D Scene Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 20
Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 21
FUSER: Feed-Forward Multiview 3D Registration Transformer and SE(3)^N Diffusion Refinement
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 22
Residual Primitive Fitting of 3D Shapes with SuperFrusta
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 23
SmokeSVD: Smoke Reconstruction from A Single View via Progressive Novel View Synthesis and Refinement with Diffusion Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 24
SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 25
Affostruction: 3D Affordance Grounding with Generative Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 26
MV-RoMa: From Pairwise Matching into Multi-View Track Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 27
Unified Primitive Proxies for Structured Shape Completion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 28
ART: Articulated Reconstruction Transformer
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 29
SCE-SLAM: Scale-Consistent Monocular SLAM via Scene Coordinate Embeddings
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 30
S2D: Sparse to Dense Lifting for 3D Reconstruction with Minimal Inputs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 31
Pip-Stereo: Progressive Iterations Pruner for Iterative Optimization based Stereo Matching
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 32
Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 33
E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 34
QVGGT: Post-Training Quantized Visual Geometry Grounded Transformer
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 35
SRGCD: Stability-Driven Region Growth Framework for 3D Change Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 36
D-Prism: Differentiable Primitives for Structured Dynamic Modeling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 37
STAC: Plug-and-Play Spatio-Temporal Aware Cache Compression for Streaming 3D Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 38
Stabilizing Streaming Video Geometry via Dynamic Feature Normalization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 39
LaS-Comp: Zero-shot 3D Completion with Latent–Spatial Consistency
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 41
EfficientMonoHair: Fast Strand-Level Reconstruction from Monocular Video via Multi-View Direction Fusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 42
OSPO: Object-Centric Self-Improving Preference Optimization for Text-to-Image Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 43
MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 44
StyleTextGen: Style-Conditioned Multilingual Scene Text Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 45
CRAFT-LoRA: Content-Style Personalization via Rank-Constrained Adaptation and Training-Free Fusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 46
OneHOI: Unifying Human-Object Interaction Generation and Editing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 47
GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 48
Self-Paced and Self-Corrective Masked Prediction for Movie Trailer Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 49
TV2TV: A Unified Framework for Interleaved Language and Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 50
Narrative Weaver: Towards Controllable Long-Range Visual Consistency with Multi-Modal Conditioning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 51
Ref4D-VideoBench: Four-Dimensional Reference-Based Evaluation of Text-to-Video Generative Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 52
PureCC: Pure Learning for Text-to-Image Concept Customization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 53
Disentangling to Re-couple: Resolving the Similarity-Controllability Paradox in Subject-Driven Text-to-Image Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 54
Yume1.5: A Text-Controlled Interactive World Generation Model
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 55
PosterReward: Unlocking Accurate Evaluation for High-Quality Graphic Design Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 56
Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 57
SLVMEval: Synthetic Meta Evaluation Benchmark for Text-to-Long Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 58
PROMPTMINER: Black-Box Prompt Stealing against Text-to-Image Generative Models via Reinforcement Learning and VLM-Guided Optimization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 59
FlowDirector: Training-Free Flow Steering for Precise Text-to-Video Editing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 60
Self-Evaluation Unlocks Any-Step Text-to-Image Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 61
Say Cheese! Detail-Preserving Portrait Collection Generation via Natural Language Edits
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 62
LVLM-Aided Alignment of Task-Specific Vision Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 63
DeepAlign: Mitigating Modality Conflict through Modality-Specific Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 64
PG-VTON: Single-Pass Training-Free Virtual Try-On via Patch-Guided Reference Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 65
Linguistic Priors for Visual Decoupling: Towards Symmetric Vision-Brain Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 66
Scaling Spatial Intelligence with Multimodal Foundation Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 67
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 68
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 69
AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 70
CogniVerse: Revolutionizing Multi-Modal Retrieval-Augmented Generation with Cognitive Reflection and Geometric Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 71
FOZO: Forward-Only Zeroth-Order Prompt Optimization for Test-Time Adaptation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 72
Language Does Matter for Cross-Domain Few-Shot Visual Feature Enhancement
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 73
Back to Source: Open-Set Continual Test-Time Adaptation via Domain Compensation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 74
Bridging Domain Expertise and Generalization for Performance Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 75
Adaptive Data Augmentation with Multi-armed Bandit: Sample-Efficient Embedding Calibration for Implicit Pattern Recognition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 76
Bridging Domains through Subspace-Aware Model Merging
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 77
DA-Mamba: Learning Domain-Aware State Space Model for Global-Local Alignment in Domain Adaptive Object Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 78
Scaling Dense Event-Stream Pretraining from Visual Foundation Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 79
Event Stream Filtering via Probability Flux Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 80
AIMDepth: Asymmetric Image-Event Mamba for Monocular Depth Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 81
Time-Specialized Event-Image Alignment for Blur-to-Video Decomposition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 82
eRetinexGS: Retinex Modeling for Low-Light Scene Enhancement via Event Streams and 3D Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 83
Unsupervised 3d Motion Estimation Using Event Camera
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 85
ModularAgent: A Task-Aware Modular Framework for Joint Optimization of Multimodal Large Language Models and World Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 86
AstraNav-Memory: Contexts Compression for Long Memory
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 87
Test-Time Perturbation Learning with Delayed Feedback for Vision-Language-Action Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 88
OVSegDT: Segmenting Transformer for Open-Vocabulary Object Goal Navigation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 89
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 90
ActiveVLA: Injecting Active Perception into Vision-Language-Action Models for Precise 3D Robotic Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 91
ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 92
BridgeEQA: Virtual Embodied Agents for Real Bridge Inspections
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 93
SyncMos: Scalable Motion Synchronisation for Multi-Agent Scene Interaction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 94
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 95
Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 96
IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 97
InstantRetouch: Efficient and High-Fidelity Instruction-Guided Image Retouching with Bilateral Space
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 98
MICON-Bench: Benchmarking and Enhancing Multi-Image Context Image Generation in Unified Multimodal Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 99
The Devil is in Attention Sharing: Improving Complex Non-rigid Image Editing Faithfulness via Attention Synergy
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 100
ShreddingNet: Coarse-to-Fine Restoration for Multi-Source Shredded Manuscripts
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 101
Image Guides Images: Consistent Video Amodal Completion with Rectified In-Context Exemplar Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 102
Radiance Meshes for Volumetric Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 104
CoRoGS: Contextual Gaussian Splatting for Robust Large-Deviation View Synthesis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 105
ChronoGS: Disentangling Invariants and Changes in Multi-Period Scenes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 106
Real-Time Dynamic Scene Rendering with Controlled Compressibility and Contact Awareness
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 107
Splatent: Splatting Diffusion Latents for Novel View Synthesis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 109
Dynamic-Static Decomposition for Novel View Synthesis of Dynamic Scenes with Spiking Neurons
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 110
DiffSoup: Direct Differentiable Rasterization of Triangle Soup for Extreme Radiance Field Simplification
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 111
Gyro-based Deep Video Deblurring
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 112
Residual Diffusion Bridge Model for Image Restoration
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 113
MMDIR: Multimodal Instruction-Driven Framework for Mixed-Degradation Document Image Restoration
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 114
Rectifying Latent Space for Generative Single-Image Reflection Removal
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 115
Towards Generalized Multimodal Homography Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 116
Edit-aware RAW reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 117
Face2Scene: Using Facial Degradation as an Oracle for Diffusion-Based Scene Restoration
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 118
HG-Lane: High-Fidelity Generation of Lane Scenes under Adverse Weather and Lighting Conditions without Re-annotation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 120
MR. Illuminate: Zero-Shot Low-Light Image Enhancement with Diffusion Prior
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 121
FoundIR-v2: Optimizing Pre-Training Data Mixtures for Image Restoration Foundation Model
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 122
SPEGC: Continual Test-Time Adaptation via Semantic-Prompt-Enhanced Graph Clustering for Medical Image Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 123
BackSplit: The Importance of Sub-dividing the Background in Biomedical Lesion Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 125
CROWn: A Unified Framework for Anti‑Aliased Downsampling and Phase‑Calibrated Fusion in 3D Medical Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 126
Rethinking Box Supervision: Bias-Free Weakly Supervised Medical Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 127
Semi-supervised Echocardiography Video Segmentation via Anchor Semantic Awareness and Continuous Pseudo-label Reforging
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 130
Breaking Multimodal LLM Safety via Video-Driven Prompting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 131
When LoRA Betrays: Backdooring Text-to-Image Models by Masquerading as Benign Adapters
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 132
RecoverMark: Robust Watermarking for Localization and Recovery of Manipulated Faces
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 133
A Provable Energy-Guided Test-Time Defense Boosting Adversarial Robustness of Large Vision-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 134
FORCE: Transferable Visual Jailbreaking Attacks via Feature Over-Reliance CorrEction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 135
PureProof: Diffusion-Resistant Black-box Targeted Attack on Large Vision-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 136
UniDef: Universal Defense Against Unauthorized Image Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 137
Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 138
MERLIN: Building Low-SNR Robust Multimodal LLMs for Electromagnetic Signals
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 139
Rethinking Cross-Modal Anchor Alignment for Mitigating Error Accumulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 141
Omni-MMSI: Toward Identity-attributed Social Interaction Understanding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 142
Inconsistency-aware Multimodal Schrödinger Bridge for Deepfake Localization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 143
MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 144
Seeing Through Touch: Tactile-Driven Visual Localization of Material Regions
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 145
Seeing What Matters: A Training-Free Self-Guided Framework for Multimodal Detail Perception and Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 146
Illuminating Visual Identity in Universal Multimodal Embeddings
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 147
Anti-Degradation Lifelong Multi-View Clustering
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 148
The Coherence Trap: When MLLM-Crafted Narratives Exploit Manipulated Visual Contexts
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 149
Efficient and High-Fidelity Omni Modality Retrieval
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 150
Same Content, Different Answers: Cross-Modal Inconsistency in MLLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 151
Tri-Subspaces Disentanglement for Multimodal Sentiment Analysis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 152
HAVE-Bench: Hierarchical Audio-Visual Evaluation from Perception to Interaction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 153
Predictive Regularization Against Visual Representation Degradation in Multimodal Large Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 154
The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 155
CineSRD: Leveraging Visual, Acoustic, and Linguistic Cues for Open-World Visual Media Speaker Diarization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 157
UST-Hand: An Uncertainty-aware Spatiotemporal Point Cloud Interaction Network for 3D Self-supervised Hand Pose Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 158
ForeHOI: Feed-forward 3D Object Reconstruction from Daily Hand-Object Interaction Videos
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 159
Hoi! - A Multimodal Dataset for Force-Grounded, Cross-View Articulated Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 160
Enhancing Hands in 3D Whole-Body Pose Estimation with Conditional Hands Modulator
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 161
TouchDream: 3D Object Completion through Imagined Touch
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 162
ForceVLA2: Unleashing Hybrid Force-Position Control with Force Awareness for Contact-Rich Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 163
TokenHand: Discrete Token Representation for Efficient Hand Mesh Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 164
Artiverse: A Diverse and Physically Grounded Dataset for Articulated Objects
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 165
MatPedia: A Universal Generative Foundation for High-Fidelity Material Synthesis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 166
LogCD: Local-to-global Consistency Distillation for Few-step Image Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 167
EditCtrl: Disentangled Local and Global Control for Real-Time Generative Video Editing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 168
Anchoring and Rescaling Attention for Semantically Coherent Inbetweening
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 169
FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 170
LightMover: Generative Light Movement with Color and Intensity Controls
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 171
Parallel Jacobi Decoding for Fast Autoregressive Image Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 172
CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 173
CREval: An Automated Interpretable Evaluation for Creative Image Manipulation under Complex Instructions
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 174
EchoVDiff: Cardiac-Cycle Echocardiography Video Generation from Arbitrary Frame
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 175
Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 177
Frequency-Aware Flow Matching for High-Quality Image Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 178
STARFlow-V: End-to-End Video Generative Modeling with Autoregressive Normalizing Flows
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 179
MixFlow Training: Alleviating Exposure Bias with Slowed Interpolation Mixture
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 180
Improving Controllable Generation: Faster Training and Better Performance via x0-Supervision
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 182
OrionEdit: Bridging Reference and Source Images for Generalized Cross-Image Editing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 183
PositionIC: Unified Position and Identity Consistency for Image Customization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 184
P-Flow: Prompting Visual Effects Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 185
Clair Obscur: an Illumination-Aware Method for Real-World Image Vectorization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 186
SURF: Signature-Retained Fast Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 187
The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 188
Lynx: Towards High-Fidelity Personalized Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 189
VisionDirector: Vision-Language Guided Closed-Loop Refinement for Generative Image Synthesis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 190
ClusterMark: Towards Robust Watermarking for Autoregressive Image Generators with Visual Token Clustering
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 191
Stable Mean Flow: Lyapunov-Inspired One-Step Flow Matching
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 192
OPRO: Orthogonal Panel-Relative Operators for Panel-Aware In-Context Image Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 193
First Frame Is the Place to Go for Video Content Customization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 194
Scaling Zero-Shot Reference-to-Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 195
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 196
VDOT: Efficient Unified Video Creation via Optimal Transport Distillation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 197
Real-Time Generation of Streamable Talking Portrait Video with Reference-Guided Deep Compression VAEs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 198
RunawayEvil: Jailbreaking the Image-to-Video Generative Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 199
MultiAnimate: Pose-Guided Image Animation Made Extensible
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 200
Translating Signals to Languages for sEMG-Based Activity Recognition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 201
Open the Motion Door: Atomic Motion Decomposition and Recomposition for Open-Vocabulary Motion Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 203
MotionHiFlow: Text-to-Motion via Hierarchical Flow Matching
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 204
LaMoGen: Language to Motion Generation Through LLM-Guided Symbolic Inference
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 206
GVIS: Generative Vector Image Steganography
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 207
MaxMark: High-Capacity Diffusion-Native Watermarking via Robust and Invertible Latent Embedding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 208
GeoRK2: Geometry-Guided Runge–Kutta Integration for Diffusion Transformer Acceleration
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 209
Test-time Sparsity for Extreme Fast Action Diffusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 210
Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 211
A Self-Conditioned Representation Guided Diffusion Model for Realistic Text-to-LiDAR Scene Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 212
When Local Rules Create Global Order: Self-Organized Representation Learning for Latent Diffusion Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 213
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 214
R4-CGQA: Retrieval-based Vision Language Models for Computer Graphics Image Quality Assessment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 215
A³: Towards Advertising Aesthetic Assessment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 216
GraphVLM: Benchmarking Vision Language Models for Multimodal Graph Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 217
Phrase-Grounding-Aware Supervised Fine-Tuning for Chart Recognition via Side-Masked Attention
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 219
CLIP Is Shortsighted: Paying Attention Beyond the First Sentence
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 220
G^2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 221
UZ3DVG: Unaided Zero-Shot 3D Visual Grounding with Generated Language Conditions
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 222
LangField4D: Learning Identity-Adaptive and Spatio-Temporal Continuous 4D Language Fields for Dynamic Scenes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 223
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 224
CLIPoint3D: Language-Grounded Few-Shot Unsupervised 3D Point Cloud Domain Adaptation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 225
GeoTikzBridge: Advancing Multimodal Code Generation for Geometric Perception and Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 227
Geometry-Guided 3D Visual Token Pruning for Video-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 228
Context-Nav: Context-Driven Exploration and Viewpoint-Aware 3D Spatial Reasoning for Instance Navigation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 229
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 230
PanoEnv: Exploring 3D Spatial Intelligence in Panoramic Environments with Reinforcement Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 231
Hilbert-Geo: Solving Solid Geometric Problems by Neural-Symbolic Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 232
Direction-aware 3D Large Multimodal Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 233
CLAY: Conditional Visual Similarity Modulation in Vision-Language Embedding Space
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 234
Tackling Alignment Ambiguity in Person Retrieval through Conversational Attribute Mining
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 235
Beyond Global Similarity: Multi-Conditional Retrieval for Fine-Grained Cross-Modal Understanding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 236
Imagine Before Concentration: Diffusion-Guided Registers Enhance Partially Relevant Video Retrieval
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 237
What Is the Optimal Ranking Score Between Precision and Recall? We Can Always Find It and It Is Rarely F1
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 238
Robust Remote Sensing Image–Text Retrieval with Noisy Correspondence
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 239
PinPoint: Evaluation of Composed Image Retrieval with Explicit Negatives, Multi-Image Queries, and Paraphrase Testing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 240
Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 241
Memory Matters: Boosting Training-Free Zero-Shot Temporal Action Localization with a Learnable Lookup Table
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 242
TVHighlights: LLM-Guided Human-Free Collaborative Training for Video Highlight Detection in Movies and TV Dramas
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 243
Color When It Counts: Grayscale-Guided Online Triggering for Always-On Streaming Video Sensing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 244
Reinforcing Structured Chain-of-Thought for Video Understanding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 245
FlexiVideo: Variation-Aware Temporal Dynamics Modeling for Efficient Video Understanding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 246
MS-Temba: Multi-Scale Temporal Mamba for Understanding Long Untrimmed Videos
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 247
Learning Effective Sign Features without Text for Gloss-free Sign Language Translation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 248
META: Meta Evolution of Tool Trajectory Adaptation for Long-Video Understanding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 249
GT-SVJ: Generative-Transformer-Based Self-Supervised Video Judge For Efficient Video Reward Modeling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 250
Local Motion Matters: A Deconstruct–Recompose Paradigm for Reinforcement Learning Pre-training from Videos
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 251
Align Once to Explain: Feature Alignment for Scalable B-cosification of Foundational Vision Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 252
Rounded or Streamlined Head? Bridging Concept Bottleneck Models and Attribute-Described Object Parts
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 253
CIGMA: Causal Information-Gain Mechanistic Attribution of Attention Heads in Vision Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 254
Rethinking Concept Bottleneck Models: From Pitfalls to Solutions
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 255
Make it SING: Analyzing Semantic Invariants in Classifiers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 256
Back to the Feature: Explaining Video Classifiers with Video Counterfactual Explanations
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 257
LEADER: Learning Reliable Local-to-Global Correspondences for LiDAR Relocalization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 258
UniCorrn: Unified Correspondence Transformer Across 2D and 3D
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 259
Probabilistic Discrepancy Learning for Roadside LiDAR Scene Completion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 260
TACO: Task-Aware Contrastive Learning for Joint LiDAR Localization and 3D Object Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 261
Adapting Point Cloud Analysis via Multimodal Bayesian Distribution Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 262
Learning Coordinate-based Convolutional Kernels for Continuous SE(3) Equivariant and Efficient Point Cloud Analysis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 263
R3-PCQA: Ray-Reprojection-Reinforcement for No-Reference 3D Point Cloud Quality Assessment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 264
Geometric-Aware Hypergraph Reasoning for Novel Class Discovery in Point Cloud Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 265
PointCSP: Cross-Sample Semantic Propagation and Stability Preservation in Self-Supervised Point Cloud Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 266
U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 267
TerraSeg: Self-Supervised Ground Segmentation for Any LiDAR
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 268
Where Does Vision Meet Language? Understanding and Refining Visual Fusion in MLLMs via Contrastive Attention
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 269
UniRefiner: Teaching Pre-trained ViTs to Self-Dispose Dross via Contrastive Register
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 270
SigLino: Efficient Multi-Teacher Distillation for Agglomerative Vision Foundation Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 271
Heuristic-inspired Reasoning Priors Facilitate Data-Efficient Referring Object Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 272
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 273
AVION: Aerial Vision–Language Instruction from Offline Teacher to Prompt-Tuned Network
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 274
CrossVL: Complexity-Aware Feature Routing and Paired Curriculum for Cross-View Vision-Language Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 275
Masking Teacher and Reinforcing Student for Distilling Vision-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 276
Role-SynthCLIP: A Role-Play Driven Diverse Synthetic Data Approach
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 277
BiMotion: B-spline Motion for Text-guided Dynamic 3D Character Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 278
PSDesigner: Automated Graphic Design with a Human-Like Creative Workflow
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 279
CADFS: A Big CAD Program Dataset and Framework for Computer-Aided Design with Large Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 280
MapRoute: Precise-Concept Erasing Mappers via Semantic Routing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 281
PhotoFramer: Multi-modal Image Composition Instruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 282
Can We Build Scene Graphs, Not Classify Them? FlowSG: Progressive Image-Conditioned Scene Graph Generation with Flow Matching
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 283
DuetSVG: Unified Multimodal SVG Generation with Internal Visual Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 285
Frequency-domain Manipulation for Face Obfuscation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 286
Towards Reasoning-Preserving Unlearning in Multimodal Large Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 287
Erasing Thousands of Concepts: Towards Scalable and Practical Concept Erasure for Text-to-Image Diffusion Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 288
POUR: A Provably Optimal Method for Unlearning Representation via Neural Collapse
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 289
Do Vision-Language Models Leak What They Learn? Adaptive Token-Weighted Model Inversion Attacks
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 290
Protego: User-Centric Pose-Invariant Privacy Protection Against Face Recognition-Induced Digital Footprint Exposure
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 291
SPDMark: Selective Parameter Displacement for Robust Video Watermarking
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 292
Enhancing Visual Representation with Textual Semantics: Textual Semantics-Powered Prototypes for Heterogeneous Federated Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 293
FedHarmony: Harmonizing Heterogeneous Label Correlations in Federated Multi-Label Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 294
FedSST: Rethinking Fair Federated Graph Learning under Structural Shift
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 295
GDFA: Geometry-Driven Federated Unlearning with Directional Task Vector Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 296
FedARA: Resource-adaptive Low-rank Personalized Federated Learning via Anchor-driven Representation Alignment on Heterogeneous Edge Devices
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 297
InterRVOS: Interaction-Aware Referring Video Object Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 298
RE-VLM: Event-Augmented Vision-Language Model for Scene Understanding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 299
RegFormer: Transferable Relational Grounding for Efficient Weakly-Supervised Human-Object Interaction Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 300
Learning to Refuse: Refusal-Aware Reinforcement Fine-Tuning for Hard-Irrelevant Queries in Video Temporal Grounding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 301
GroundVTS: Visual Token Sampling in Multimodal Large Language Models for Video Temporal Grounding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 302
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 303
Tokenization Allows Multimodal Large Language Models to Understand, Generate and Edit Architectural Floor Plans
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 304
MeToM: Metadata-Guided Token Merging for Efficient Video LLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 305
Token Reduction via Local and Global Contexts Optimization for Efficient Video Large Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 306
VLIC: Vision-Language Models As Perceptual Judges for Human-Aligned Image Compression
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 307
Mostly Text, Smart Visuals: Asymmetric Text-Visual Pruning for Large Vision-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 308
Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 309
CoIn: Coverage and Informativeness-Guided Token Reduction for Efficient Large Multimodal Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 310
TAMER: A Tri-Modal Contrastive Alignment and Multi-Scale Embedding Refinement Framework for Zero-Shot ECG Diagnosis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 311
Your Dissimilarities Define You: Complementary Learning Exploiting Class Diversities
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 312
CGU-Bayes: Causal Graph Uncertainty-Guided Bayesian Inference for Domain Generalization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 313
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 314
Towards Stable Self-Supervised Object Representations in Unconstrained Egocentric Video
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 315
LRDUN: A Low-Rank Deep Unfolding Network for Efficient Spectral Compressive Imaging
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 316
Neural Collapse in Test-Time Adaptation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 317
CLEX: Complementary Label Exchange Learning for Noisy Facial Expression Recognition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 318
TruckDrive: Long-Range Autonomous Highway Driving Dataset
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 319
Neuro-Cognitive Reward Modeling for Human-Centered Autonomous Vehicle Control
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 320
E3AD: An Emotion-Aware Vision-Language-Action Model for Human-Centric End-to-End Autonomous Driving
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 321
The Blind Spot of Adaptation: Quantifying and Mitigating Forgetting in Fine-tuned Driving Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 322
Den-TP: A Density-Balanced Data Curation and Evaluation Framework for Trajectory Prediction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 323
Percept-WAM: Perception-Enhanced World-Awareness-Action Model for Robust End-to-End Autonomous Driving
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 324
GaussianDWM: 3D Gaussian Driving World Model for Unified Scene Understanding and Multi-Modal Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 325
Mind the Hitch: Dynamic Calibration and Articulated Perception for Autonomous Trucks
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 327
Beyond Rule-Based Agents: Active Markov Games for Realistic Multi-Agent Interaction in Autonomous Driving
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 328
Test-Time Multi-Prompt Adaptation for Open-Vocabulary Remote Sensing Image Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 329
ReScene4D: Temporally Consistent Semantic Instance Segmentation of Evolving Indoor 3D Scenes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 330
CrackSSM: Reviving SSMs for Crack Segmentation via Dynamic Scanning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 331
BiPA: Bilevel Prompt Adaptation for Underwater Instance Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 332
RS-SSM: Refining Forgotten Specifics in State Space Model for Video Semantic Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 333
Scene-Centric Unsupervised Video Panoptic Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 334
Bootstrapping Video Semantic Segmentation Model via Distillation-assisted Test-Time Adaptation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 335
GeoFree-CoSeg: Unsupervised Point Cloud-Image Cross-Modal Co-Segmentation Without Geometric Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 336
Parameter-efficient Continual Learning for Enhancing Plasticity without Forgetting under Limited Model Capacity
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 337
Dual-Estimator: Decoupling Global and Local Semantic Shift for Drift Compensation in Class-Incremental Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 338
Continual Distillation of Teachers from Different Domains
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 339
Multimodal Continual Instruction Tuning with Dynamic Gradient Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 340
Learning from Itself: Mining Internal Knowledge from Vision Language Models for Continual Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 341
AdaPrior: Bayesian-Inspired Adaptive Prior Correction for Long-Tailed Continual Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 342
An Optimal Transport-driven Approach for Cultivating Latent Space in Online Incremental Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 343
HAD: Heterogeneity-Aware Distillation for Lifelong Heterogeneous Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 344
U-Mind: A Unified Framework for Real-Time Multimodal Interaction with Audiovisual Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 345
StreamAvatar: Streaming Diffusion Models for Real-Time Interactive Human Avatars
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 346
FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 347
WildCap: Facial Albedo Capture in the Wild via Hybrid Inverse Rendering
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 348
EmoTaG: Emotion-Aware Talking Head Synthesis on Gaussian Splatting with Few-Shot Personalization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 349
DyaDiT: A Multi-Modal Diffusion Transformer for Socially Favorable Dyadic Gesture Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 350
TRM-VLA: Temporal-Aware Chain-of-Thought Reasoning and Memorization for Vision-Language-Action Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 351
VGGDrive: Empowering Vision-Language Models with Cross-View Geometric Grounding for Autonomous Driving
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 352
NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 353
HTNav: A Hybrid Navigation Framework with Tiered Structure for Urban Aerial Vision-and-Language Navigation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 354
CycleBEV: Regularizing View Transformation Networks via View Cycle Consistency for Bird’s-Eye-View Semantic Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 355
STAvatar: Soft Binding and Temporal Density Control for Monocular 3D Head Avatars Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 356
CrowdGaussian: Reconstructing High-Fidelity 3D Gaussians for Human Crowd from a Single Image
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 357
OMG-Avatar: One-shot Multi-LOD Gaussian Head Avatar
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 358
Globally Optimal Pose from Orthographic Silhouettes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 359
AvatarPointillist: AutoRegressive 4D Gaussian Avatarization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 360
COPO: Causal-Oriented Policy Optimization for Hallucinations of MLLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 361
Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 362
AdaIAT: Adaptively Increasing Attention to Generated Text to Alleviate Hallucinations in LVLM
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 363
HulluEdit: Single-Pass Evidence-Consistent Subspace Editing for Mitigating Hallucinations in Large Vision-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 364
SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 365
One Token, Two Fates: A Unified Framework via Vision Token Manipulation Against MLLMs Hallucination
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 366
EgoX: Egocentric Video Generation from a Single Exocentric Video
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 367
SymphoMotion: Joint Control of Camera Motion and Object Dynamics for Coherent Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 368
Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 369
SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 371
Scaling4D: Pushing the Frontier of Video Novel View Synthesis through Large-Scale Monocular Videos
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 372
PHANTOM: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 373
WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 374
Let Your Image Move with Your Motion! -- Implicit Multi-Object Multi-Motion Transfer
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 375
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 376
D2FANet: Enhancing Video Object Detection with Dual-Domain Feature Aggregation Network
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 377
HierUQ: Hierarchical Uncertainty Quantification with Adaptive Granularity Reconciliation for Degraded Image Classification
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 378
ID-Sim: An Identity-Focused Similarity Metric
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 379
Hier-COS: Making Deep Features Hierarchy-aware via Composition of Orthogonal Subspaces
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 380
Towards Cross-Modal Preservation, Consistency and Alignment for Privacy-Preserving Visible-Infrared Person Re-Identification
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 382
COPE: Consistent Occlusion and Prompt Enhancement Network for Occluded Person Re-identification
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 383
Assignment-Driven Hash Learning in a Hyper-Semantic Space for On-the-Fly Category Discovery
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 384
DyFCLT: Dynamic Frequency-Decoupled Cross-Modal Learning Transformer for Multimodal Tiny Object Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 385
EW-DETR: Evolving World Object Detection via Incremental Low-Rank DEtection TRansformer
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 386
Building a Precise Video Language with Human–AI Oversight
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 387
CoCoVideo: The High-Quality Commercial-Model-Based Contrastive Benchmark for AI-Generated Video Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 388
Towards Sparse Video Understanding and Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 389
Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 390
MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 391
ParallelVLM: Lossless Video-LLM Acceleration with Visual Alignment Aware Parallel Speculative Decoding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 392
TiViBench: Benchmarking Think-in-Video Reasoning for Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 393
What Are You Doing? A Closer Look at Controllable Human Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 394
Score2Instruct: Scaling Up Video Quality-Centric Instructions via Automated Dimension Scoring
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 395
CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 396
Towards Holistic Modeling for Video Frame Interpolation with Auto-regressive Diffusion Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 397
DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 398
Towards High-resolution and Disentangled Reference-based Sketch Colorization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 399
MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 400
Layer-wise Instance Binding for Regional and Occlusion Control in Text-to-Image Diffusion Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 401
Memory-Efficient Fine-Tuning Diffusion Transformers via Dynamic Patch Sampling and Block Skipping
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 402
COT-FM: Cluster-wise Optimal Transport Flow Matching
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 403
Interpretable Motion-Attentive Maps: Spatio-Temporally Localizing Concepts in Video Diffusion Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 404
Guiding a Diffusion Transformer with the Internal Dynamics of Itself
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 405
CoopDiff: A Diffusion-Guided Approach for Cooperation under Corruptions
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 406
RARE: Learn to RAnk and REtrieve for Monocular 3D Object Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 407
COG: Confidence-aware Optimal Geometric Correspondence for Unsupervised Single-reference Novel Object Pose Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 408
Learnability-Driven Submodular Optimization for Active Roadside 3D Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 409
Look Before You Fuse: 2D-Guided Cross-Modal Alignment for Robust 3D Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 410
Long-SCOPE: Fully Sparse Long-Range Cooperative 3D Perception
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 411
Dynamics-Aware Preference Optimization for Vision-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 413
Learning What Helps: Task-Aligned Context Selection for Vision Tasks
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 415
NeuroRule: Bridging Vision and Logic with Differentiable Rule Induction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 416
Beyond Graph Model: Reliable VLM Fine-Tuning via Random Graph Adapter
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 417
Ego: Embedding-Guided Personalization of Vision-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 418
JoPPO: Hierarchical Photography Assessment via Contrastive Joint Conditional Probabilistic Reinforcement Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 419
AeroAgent: A Vision–Physics–Decision Framework for Aerodynamic Vehicle Design
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 420
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 421
Prune Wisely, Reconstruct Sharply: Compact 3D Gaussian Splatting via Adaptive Pruning and Difference-of-Gaussian Primitives
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 422
MSCD-GS: Motion-Separated Cooperative Deblurring Dynamic Reconstruction via Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 423
P2GS: Physical Prior-guided Gaussian Splatting for Photometrically Consistent Urban Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 424
iSplat: Iterative Learning for Fine-Grained Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 425
Off The Grid: Detection of Primitives for Feed-Forward 3D Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 426
MAPo: Motion-Aware Partitioning of Deformable 3D Gaussian Splatting for High-Fidelity Dynamic Scene Reconstruction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 427
FreeArtGS: Articulated Gaussian Splatting Under Free-moving Scenario
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 428
HeroGS: Hierarchical Guidance for Robust 3D Gaussian Splatting under Sparse Views
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 429
SharpTimeGS: Sharp and Stable Dynamic Gaussian Splatting via Lifespan Modulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 430
Physically Inspired Gaussian Splatting for HDR Novel View Synthesis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 431
PhysIR-Splat: Physically Consistent Thermal Infrared Radiative Transfer in 3D Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 432
4C4D: 4 Camera 4D Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 433
SplatSuRe: Selective Super-Resolution for Multi-view Consistent 3D Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 434
GaussianZoom: Progressive Zoom-in Generative 3D Gaussian Splatting with Geometric and Semantic Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 435
MotionScale: Reconstructing Appearance, Geometry, and Motion of Dynamic Scenes with Scalable 4D Gaussian Splatting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 436
PRIMU: Uncertainty Estimation for Novel Views in Gaussian Splatting from Primitive-Based Representations of Error and Coverage
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 437
TGSFormer: Scalable Temporal Gaussian Splatting for Embodied Semantic Scene Completion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 438
Disco-GS: Gaussian Splatting in Dynamic Color Lighting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 439
ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 440
GuardTrace-VL: Detecting Unsafe Multimodel Reasoning via Iterative Safety Supervision
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 441
AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 442
See It, Say It, Sorted: An Iterative Training-Free Framework for Visually-Grounded Multimodal Reasoning in LVLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 443
Will Multimodal Models Be Dazzled by Multi-Image Visual Puzzles?
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 444
GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 445
Visual Grounding for Object Questions
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 446
CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 447
What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 448
Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 449
Stable and Efficient Single-Rollout RL for Multimodal Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 450
Revisiting the Necessity of Lengthy Chain-of-Thought in Vision-centric Reasoning Generalization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 451
Monet: Reasoning in Latent Visual Space Beyond Image and Language
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 452
STAR-R1: Multi-View Spatial TrAnsformation Reasoning by Reinforcing Multimodal LLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 453
From Where Things Are to What They Are For: Benchmarking Spatial–Functional Intelligence in Multimodal LLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 454
Deeper Thought, Weaker Aim: Understanding and Mitigating Perceptual Impairment during Reasoning in Multimodal Large Language Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 455
S2D: Selective Spectral Decay for Quantization-Friendly Conditioning of Neural Activations
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 456
OneSparse: A Unified Framework for Sparse Activation Layers in Vision Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 457
What Matters in Practical Learned Image Compression
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 458
BinaryAttention: One-Bit QK-Attention for Vision and Diffusion Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 460
LazyVAR: Accelerating Visual Autoregressive Models via Scale-wise Token Pruning and Parallel Group Decoding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 461
Spk2VidNet: A Hierarchical Recurrent Architecture for High-Fidelity Video Reconstruction from Long Spike-Camera Streams
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 462
Adaptive Learned Image Compression with Graph Neural Networks
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 463
SGI: Structured 2D Gaussians for Efficient and Compact Large Image Representation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 464
VVS: Accelerating Speculative Decoding for Visual Autoregressive Generation via Partial Verification Skipping
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 465
HypeVPR: Exploring Hyperbolic Space for Perspective to Equirectangular Visual Place Recognition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 467
CoLoR: The Devil is in Scene Coordinate Regression for Large-Scale Visual Localization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 468
Affine Perspective-Three-Point Problem
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 469
Sky2Ground: A Benchmark for Site Modeling under Varying Altitude
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 470
SemanticVLA: Towards Semantic Reasoning over Action Memorization via Synergistic Explicit Trace and Latent Action Planning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 471
WebGym: Scaling Training Environments for Long-Horizon Visual Web Agents with Realistic Tasks
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 472
Beyond Perceptual Shortcuts: Causal-Inspired Debiasing Optimization for Generalizable Video Reasoning in Lightweight MLLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 473
APPO: Attention-guided Perception Policy Optimization for Video Reasoning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 474
RetouchIQ: MLLM Agents for Instruction-Based Image Retouching with Generalist Reward
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 475
EVA: Efficient Reinforcement Learning for End-to-End Video Agent
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 476
Visual Document Understanding and Reasoning: A Multi-Agent Collaboration Framework with Agent-Wise Adaptive Test-Time Scaling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 477
GazeOnce360: Fisheye-Based 360° Multi-Person Gaze Estimation with Global–Local Feature Fusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 478
Bridging Human Evaluation to Infrared and Visible Image Fusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 479
Beyond Strict Pairing: Arbitrarily Paired Training for High-Performance Infrared and Visible Image Fusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 480
Semantic-Adaptive Diffusion for Dynamic Spatiotemporal Fusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 481
Bayesian Decomposition and Semantic Completion for Few-shot Semantic Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 482
From Few-way to Many-way: Rethinking Few-shot Fine-grained Image Classification
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 484
Selective, Regularized, and Calibrated: Harnessing Vision Foundation Models for Cross-Domain Few-Shot Semantic Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 485
FlowComposer: Composable Flows for Compositional Zero-Shot Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 486
ManifoldGD: Training-Free Hierarchical Manifold Guidance for Diffusion-Based Dataset Distillation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 487
DMGD: Train-Free Dataset Distillation with Semantic-Distribution Matching in Diffusion Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 488
UniRain: Unified Image Deraining with RAG-based Dataset Distillation and Multi-objective Reweighted Optimization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 489
Leveraging Multispectral Sensors for Color Correction in Mobile Cameras
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 490
Differentiable Adaptive 4D Structured Illumination for Joint Capture of Shape and Reflectance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 491
Optical Diffraction-based Convolution for Semiconductor Lithography
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 492
GSNR: Graph Smooth Null-Space Representation for Inverse Problems
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 493
MatE: Material Extraction from Single-Image via Geometric Prior
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 494
αMatte4K & µMatting: Dataset and Model for Ultra-Micro Precision Alpha Video Matting
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 495
Revisiting Optimal Coding for I-ToF under Practical Sensor Constraints
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 496
Dynamic Black-hole Emission Tomography with Physics-informed Neural Fields
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 497
Exploring Spatiotemporal Feature Propagation for Video-Level Compressive Spectral Reconstruction: Dataset, Model and Benchmark
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 499
SAR2Net: Learning Spatially Anchored Representations for Retrieval-Guided Cross-Stain Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 500
Advancing Cancer Prognosis with Hierarchical Fusion of Genomic, Proteomic and Pathology Imaging Data from a Systems Biology Perspective
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 501
PromptStereo: Zero-Shot Stereo Matching via Structure and Motion Prompts
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 502
Any Resolution Any Geometry: From Multi-View To Multi-Patch
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 503
Paparazzo: Active Mapping of Moving 3D Objects
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 504
DepthFocus: Controllable Depth Estimation for See-Through Scenes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 505
OVI-MAP: Open-Vocabulary Instance-Semantic Mapping
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 506
PTC-Depth: Pose-Refined Monocular Depth Estimation with Temporal Consistency
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 507
SceneScribe-1M: A Large-Scale Video Dataset with Comprehensive Geometric and Semantic Annotations
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 508
Omni-3DEdit: Generalized Versatile 3D Editing in One-Pass
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 510
Variational Graph-based Normal Integration
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 511
Vinedresser3D: Towards Agentic Text-guided 3D Editing
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 512
MV2UV: Generating High-quality UV Texture Maps with Multiview Prompts
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 513
Learning Hierarchical Hyperbolic Mixture Model for Part-aware 3D Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 514
MeshRipple: Structured Autoregressive Generation of Artist-Meshes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 515
FACE: A Face-based Autoregressive Representation for High-Fidelity and Efficient Mesh Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 516
Easy3E: Feed-Forward 3D Asset Editing via Rectified Voxel Flow
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 517
CUPID: Generative 3D Reconstruction via Joint Object and Pose Modeling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 518
3D-Fixer: Coarse-to-Fine In-place Completion for 3D Scenes from a Single Image
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 519
DRM: Diffusion-based Reward Model With Step-wise Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 520
Taming Preference Mode Collapse via Directional Decoupling Alignment in Diffusion Reinforcement Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 521
VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 522
SoliReward: Mitigating Susceptibility to Reward Hacking and Annotation Noise in Video Generation Reward Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 523
AnyID: Ultra-Fidelity Universal Identity-Preserving Video Generation from Any Visual References
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 524
Style-GRPO: Semantic-Aware Preference Optimization for Image Style Transfer Guided by Reward Modeling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 525
LAMP: Language-Assisted Motion Planning for Controllable Video Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 526
Diverse Video Generation with Determinantal Point Process-Guided Policy Optimization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 527
Spectral Scalpel: Amplifying Adjacent Action Discrepancy via Frequency-Selective Filtering for Skeleton-Based Action Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 528
DETACH: Decomposed Spatio-Temporal Alignment for Exocentric Video and Ambient Sensors with Staged Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 529
Learning a Unified Latent Action Space from Videos with Action-centric Cycle Consistency
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 530
VideoNet: A Large-Scale Dataset for Domain-Specific Action Recognition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 531
BD-Merging: Bias-Aware Dynamic Model Merging with Evidence-Guided Contrastive Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 533
Spherical Leech Quantization for Visual Tokenization and Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 534
MSPT: Efficient Large-Scale Physical Modeling via Parallelized Multi-Scale Attention
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 535
GR-Gauge: Cost-efficient Training Configuration By Gauging the Gradient Redundancy
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 536
E^2-SCI: Elastic Edge–Cloud Speculative Decoding via Credit Inertia
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 537
HyperNAS: Enhancing Architecture Representation for NAS Predictor via Hypernetwork
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 538
NeuroFlow: Toward Unified Visual Encoding and Decoding from Neural Activity
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 539
Spectral Conformal Risk Control: Distribution-Free Tail Guarantees via Bayesian Quadrature
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 540
Edge-RecViT: Efficient Vision Transformer via Semantic-Refined Dynamic Recursion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 541
ERMoE: Eigen-Reparameterized Mixture-of-Experts for Stable Routing and Interpretable Specialization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 542
GUI-SAGE: Enhancing GUI Automation with Self-Explanatory Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 543
GUIDE: A Benchmark for Understanding and Assisting Users in Open-Ended GUI Tasks
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 544
HiconAgent: History Context-aware Policy Optimization for GUI Agents
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 545
PET-DINO: Unifying Visual Cues into Grounding DINO with Prompt-Enriched Training
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 546
SDDF: Specificity-Driven Dynamic Focusing for Open-Vocabulary Camouflaged Object Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 549
Prompt-Free Universal Region Proposal Network
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 551
PaNDaS: Learnable Shape Interpolation Modeling with Localized Control
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 552
Hilbert Curve-Based Attention Enabling Topology-Preserving Image Tensor Representation for Semantic Segmentation Network
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 553
Towards High-Quality Image Segmentation: Improving Topology Accuracy by Penalizing Neighbor Pixels
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 554
SAGE: Style-Adaptive Generalization for Privacy-Constrained Semantic Segmentation Across Domains
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 555
Better than Average: Spatially-Aware Aggregation of Segmentation Uncertainty Improves Downstream Performance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 556
Universal 3D Shape Matching via Coarse-to-Fine Language Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 557
Direct Segmentation without Logits Optimization for Training-Free Open-Vocabulary Semantic Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 558
CDICS: Delving Into Fine-Grained Attribute for In-Context Segmentation via Compositional Prompts and Phased Decoupling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 559
Discriminative Perception via Anchored Description for Reasoning Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 560
SegEarth-R2: Towards Comprehensive Language-guided Segmentation for Remote Sensing Images
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 561
Cross-Scale Pansharpening via ScaleFormer and the PanScale Benchmark
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 562
CrossEarth-Gate: Fisher-Guided Adaptive Tuning Engine for Efficient Adaptation of Cross-Domain Remote Sensing Semantic Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 563
Multigrain-aware Semantic Prototype Scanning and Tri-Token Prompt Learning Embraced High-Order RWKV for Pan-Sharpening
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 564
ACPV-Net: All-Class Polygonal Vectorization for Seamless Vector Map Generation from Aerial Imagery
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 565
Beyond Endpoints: Path-Centric Reasoning for Vectorized Off-Road Network Extraction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 566
Rotation Invariant and Symmetry Aware Pixel Difference Network for Remote Sensing Object Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 567
F2Net: A Frequency-Fused Network for Ultra-High Resolution Remote Sensing Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 568
RoadGIE: Towards A Global-Scale Aerial Benchmark for Generalizable Interactive Road Extraction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 569
PGA: Prior-free Generative Attack for Practical No-box Scenario
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 570
Lipschitz Optimization for Formal Verification of Homographies
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 571
Batman: Benign Knowledge Alignment Through Malicious Null Space in Federated Backdoor Attack
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 572
Out of Sight, Out of Track: Adversarial Attacks on Propagation-based Multi-Object Trackers via Query State Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 573
Eliminate Distance Differences Induced by Backdoor Attacks: Layer-Selective Training and Clipping to Mask Backdoor Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 574
Mitigating Error Amplification in Fast Adversarial Training
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 575
Physical Adversarial Clothing Evades Visible-Thermal Detectors via Non-Overlapping RGB-T Pattern
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 576
What Your Features Reveal: Data-Efficient Black-Box Feature Inversion Attack for Split DNNs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 577
Exposing Functional Fusion: A New Class of Strategic Backdoor in Dynamic Prompt Architectures
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 578
Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 579
Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 580
FM-Steer: Enhance Generalist Policies with Value-Guided Cascaded Denoising
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 581
Bootstrap Dynamic-Aware 3D Visual Representation for Scalable Robot Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 582
Visual Sim-to-Real at Scale for Humanoid Loco-Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 583
Contact-Aware Neural Dynamics
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 584
AVA-VLA: Improving Vision-Language-Action models with Active Visual Attention
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 585
UAST: Unified Active Search and Tracking for Arbitrary Targets with UAVs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 586
SwiftVLA: Unlocking Spatiotemporal Dynamics for Lightweight VLA Models at Minimal Overhead
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 587
Visual-RRT: Finding Paths toward Visual-Goals via Differentiable Rendering
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 588
Cross-Hand Latent Representation for Vision-Language-Action Models
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 589
Beyond Success: Refining Elegant Robot Manipulation from Mixed-Quality Data via Just-in-Time Intervention
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 590
Physically Ground Commonsense Knowledge for Articulated Object Manipulation with Analytic Concepts
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 591
GeoPredict: Leveraging Predictive Kinematics and 3D Gaussian Geometry for Precise VLA Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 592
From Manuals to Actions: A Unified VLA Model for Chain-of-Thought Manual Generation and Robotic Manipulation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 594
Rethinking Occlusion Modeling for UAV Tracking
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 595
Adaptive Capacity Autoregressive Visual Tracking
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 596
Spatio-Temporal Conditional Denoising Transformer for Modality-Missing RGBT Tracking
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 597
Breaking Smooth-Motion Assumptions: A UAV Benchmark for Multi-Object Tracking in Complex and Adverse Conditions
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 598
TrackMAE: Video Representation Learning via Track Mask and Predict
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 599
Dual-branch Distilled Transformer for Efficient Asymmetric UAV Tracking
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 600
Multi-view Crowd Tracking Transformer with View-Ground Interactions Under Large Real-World Scenes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 601
Scaling Self-Supervised and Cross-Modal Pretraining for Volumetric CT Transformers
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 602
MuViT: Multi-Resolution Vision Transformers for Learning Across Scales in Microscopy
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 603
SemVideo: Reconstructs What You Watch from Brain Activity via Hierarchical Semantic Guidance
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 604
Multimodal Causality-Driven Representation Learning for Generalizable Medical Image Segmentation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 605
Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 606
TopoSlide: Topologically-Informed Histopathology Whole Slide Image Representation Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 607
Beyond the Static-World: Lifelong Learning for All-in-One Medical Image Restoration
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 608
Hyperbolic Relational Prompts for Intersectional Fairness in Medical VLMs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 609
RNED: Rotary Number Encoding and Decoding for Quantitative Medical VLM Analysis
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 610
MLLM-HWSI: A Multimodal Large Language Model for Hierarchical Whole Slide Image Understanding
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 611
Learning Generalizable 3D Medical Image Representations from Mask-Guided Self-Supervision
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 612
BiOTPrompt: Bidirectional Optimal Transport Guided Prompting for Disease Evolution-aware Radiology Report Generation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 613
Learning to See Through a Baby’s Eyes: Early Visual Diets Enable Robust Visual Intelligence in Humans and Machines
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 614
UDAPose: Unsupervised Domain Adaptation for Low-Light Human Pose Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 615
Enhancing Accuracy of Uncertainty Estimation in Appearance-based Gaze Tracking with Probabilistic Evaluation and Calibration
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 616
SCAPO: Self-Supervised Category-Level Articulated Pose Estimation from a Single 3D Observation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 617
Composite-Attribute Person Re-Identification via Pose-Guided Disentanglement
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 618
Representing 3D Faces with Learnable B-Spline Volumes
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 619
RHINO: Reconstructing Human Interactions with Novel Objects from Monocular Videos
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 620
HumanBA: Human-Aware Bundle Adjustment via Global Human-Camera Decoupling
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 621
HamiPose: Hamiltonian Optimization for Unsupervised Domain Adaptive Pose Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 622
KASALv2: Fully Automatic 3D Rotational Symmetry Classification and Axis Localization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 623
AnyLift: Scaling Motion Reconstruction from Internet Videos via 2D Diffusion
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 624
Active Inference for Micro-Gesture Recognition: EFE-Guided Temporal Sampling and Adaptive Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 625
ArtPro: Self-Supervised Articulated Object Reconstruction with Adaptive Integration of Mobility Proposals
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 626
Similarity-Consistent Likelihood Diffusion enables Hidden Person Detection from Wall Reflections
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 627
VLM-Guided Group Preference Alignment for Diffusion-based Human Mesh Recovery
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 628
Occluded Human Body Capture with Frequency Domain Denoising Prior
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 629
ResiHMR: Residual-Limb Aware Single-Image 3D Human Mesh Recovery for Individuals with Limb Loss
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 630
OnlineHMR: Video-based Online World-Grounded Human Mesh Recovery
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 631
MimiCAT: Mimic with Correspondence-Aware Cascade-Transformer for Category-Free 3D Pose Transfer
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 632
Exploring Adaptive Masked Reconstruction for Self-Supervised Skeleton-Based Action Recognition
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 633
DFD-HR: Generalizable Deepfake Detection via Hierarchical Routing Learning
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 634
MGDHand: Multi-Granularity Prior-to-Inertial Distillation Framework for Sequential 3D Hand Pose Estimation from Sparse IMUs
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 635
CARI4D: Category Agnostic 4D Reconstruction of Human-Object Interaction
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 636
E-3DPSM: A State Machine for Event-based Egocentric 3D Human Pose Estimation
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 637
Bézier Degradation Modeling for LiDAR-based Human Motion Capture
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 638
UniSH: Unifying Scene and Human Reconstruction in a Feed-Forward Pass
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 639
Illumination-Consistent Human-Scene Reconstruction from Monocular Video
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 640
Attribution as Retrieval: Model-Agnostic AI-Generated Image Attribution
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 641
Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 642
Enabling Supervised Learning of Generative Signatures for Generalized Synthetic Image Detection
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 643
DiffusionFF: A Diffusion-based Framework for Joint Face Forgery Detection and Fine-Grained Artifact Localization
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 644
All in One: Unifying Deepfake Detection, Tampering Localization, and Source Tracing with a Robust Landmark-Identity Watermark
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 645
Towards an Incremental Unified Multimodal Anomaly Detection: Augmenting Multimodal Denoising From an Information Bottleneck Perspective
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 646
AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 647
Dual-Prototype-Guided Multi-task Learning for Unsupervised Anomaly Detection and Classification
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 648
The Road Less Seen: Segment Exploration for Weakly Supervised Video Anomaly Detection
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 649
Omni-AD: A Large-scale and Versatile Benchmark for Industrial Anomaly Detection
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 650
Back to Point: Exploring Point-Language Models for Zero-Shot 3D Anomaly Detection
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 651
Complementary Prototype Mapping for Efficient Multimodal Anomaly Detection
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 652
LiDAS: Lighting-driven Dynamic Active Sensing for Nighttime Perception
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 653
Gau-Occ: Geometry-Completed Gaussians for Multi-Modal 3D Occupancy Prediction
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 654
OpenVO: Open-World Visual Odometry with Temporal Dynamics Awareness
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 655
An Instance-Centric Panoptic Occupancy Prediction Benchmark for Autonomous Driving
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 656
OneOcc: Semantic Occupancy Prediction for Legged Robots with a Single Panoramic Camera
[
Poster]
Poster
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F 657
ProOOD: Prototype-Guided Out-of-Distribution 3D Occupancy Prediction
[
Poster]
Poster Session
Fri Jun 05 03:00 PM -- 05:00 PM (PDT) @ ExHall A & F
Poster Session 2 & Exhibit Hall w/ Coffee Break
Art Program
Fri Jun 05 04:00 PM -- 04:30 PM (PDT) @ ExHall F
Art Gallery Tour with Curator and Artists
Break
Sat Jun 06 06:30 AM -- 08:00 AM (PDT) @ ExHall C
Breakfast
Registration
Sat Jun 06 06:30 AM -- 04:00 PM (PDT) @ Lobby A
Registration / Badge Pickup
Oral
Sat Jun 06 08:00 AM -- 08:12 AM (PDT) @ Four Seasons Ballroom
ComPose: A Unified Completion-Pose Framework for Robust Category-Level Object Pose Estimation
Oral
Sat Jun 06 08:00 AM -- 08:12 AM (PDT) @ Mile High Ballroom 1A - 2A
3D-LATTE: Latent Space 3D Editing from Textual Instructions
Oral
Sat Jun 06 08:00 AM -- 08:12 AM (PDT) @ Mile High Ballroom 3A - 4A
Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression
Oral
Sat Jun 06 08:00 AM -- 08:12 AM (PDT) @ Bluebird Ballroom
Breaking Semantic Boundaries: Distribution-Guided Semantic Exploration for Creative Generation
Oral Session
Sat Jun 06 08:00 AM -- 09:15 AM (PDT) @ Four Seasons Ballroom
Oral Session 3B: Spatial Understanding
Oral Session
Sat Jun 06 08:00 AM -- 09:15 AM (PDT) @ Mile High Ballroom 1A - 2A
Oral Session 3C: Generative Editing
Oral Session
Sat Jun 06 08:00 AM -- 09:15 AM (PDT) @ Bluebird Ballroom
Oral Session 3A: Generative Diffusion Modeling
Oral Session
Sat Jun 06 08:00 AM -- 09:15 AM (PDT) @ Mile High Ballroom 3A - 4A
Oral Session 3D: Multimodal Modeling
Oral
Sat Jun 06 08:12 AM -- 08:25 AM (PDT) @ Mile High Ballroom 1A - 2A
AnchorFlow: Training-Free 3D Editing via Latent Anchor-Aligned Flows
Oral
Sat Jun 06 08:12 AM -- 08:25 AM (PDT) @ Four Seasons Ballroom
CoSMo3D: Open-World Promptable 3D Semantic Segmentation through LLM-Guided Canonical Spatial Modeling
Oral
Sat Jun 06 08:12 AM -- 08:25 AM (PDT) @ Mile High Ballroom 3A - 4A
FINER: MLLMs Hallucinate under Fine-grained Negative Queries
Oral
Sat Jun 06 08:12 AM -- 08:25 AM (PDT) @ Bluebird Ballroom
Guiding a Diffusion Model by Swapping Its Tokens
Oral
Sat Jun 06 08:25 AM -- 08:37 AM (PDT) @ Bluebird Ballroom
PixelDiT: Pixel Diffusion Transformers for Image Generation
Oral
Sat Jun 06 08:25 AM -- 08:37 AM (PDT) @ Mile High Ballroom 3A - 4A
MDCS-MoAME: Multi-directional Composite Scanning with Mixture of Attention and Mamba Experts for Cancer Survival Prediction
Oral
Sat Jun 06 08:25 AM -- 08:37 AM (PDT) @ Mile High Ballroom 1A - 2A
ChordEdit: One-Step Low-Energy Transport for Image Editing
Oral
Sat Jun 06 08:25 AM -- 08:37 AM (PDT) @ Four Seasons Ballroom
GeoViS: Geospatially Rewarded Visual Search for Remote Sensing Visual Grounding
Oral
Sat Jun 06 08:37 AM -- 08:50 AM (PDT) @ Mile High Ballroom 1A - 2A
Faithful Contouring: Near-Lossless 3D Voxel Representation Free from Iso-surface
Oral
Sat Jun 06 08:37 AM -- 08:50 AM (PDT) @ Four Seasons Ballroom
RobotSeg: A Model and Dataset for Segmenting Robots in Image and Video
Oral
Sat Jun 06 08:37 AM -- 08:50 AM (PDT) @ Bluebird Ballroom
SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models
[
Slides]
Oral
Sat Jun 06 08:37 AM -- 08:50 AM (PDT) @ Mile High Ballroom 3A - 4A
PAS: A Training-Free Stabilizer for Temporal Encoding in Video LLMs
Oral
Sat Jun 06 08:50 AM -- 09:02 AM (PDT) @ Mile High Ballroom 1A - 2A
Native and Compact Structured Latents for 3D Generation
Oral
Sat Jun 06 08:50 AM -- 09:02 AM (PDT) @ Mile High Ballroom 3A - 4A
PAVAS: Physics-Aware Video-to-Audio Synthesis
Oral
Sat Jun 06 08:50 AM -- 09:02 AM (PDT) @ Bluebird Ballroom
SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching
Oral
Sat Jun 06 08:50 AM -- 09:02 AM (PDT) @ Four Seasons Ballroom
S^2AM3D: Scale-controllable Part Segmentation of 3D Point Clouds
Oral
Sat Jun 06 09:02 AM -- 09:15 AM (PDT) @ Bluebird Ballroom
Streaming Diffusion Model for Fast Infrared and Visible Video Fusion
Oral
Sat Jun 06 09:02 AM -- 09:15 AM (PDT) @ Mile High Ballroom 3A - 4A
ProPhy: Progressive Physical Alignment for Dynamic World Simulation
Oral
Sat Jun 06 09:02 AM -- 09:15 AM (PDT) @ Four Seasons Ballroom
Scalable Multi-View Subspace Clustering with Tensorized Anchor Guidance
Oral
Sat Jun 06 09:02 AM -- 09:15 AM (PDT) @ Mile High Ballroom 1A - 2A
SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control
Break
Sat Jun 06 09:15 AM -- 09:30 AM (PDT)
Courtesy Break
Keynote
Sat Jun 06 09:30 AM -- 10:30 AM (PDT) @ Bluebird Ballroom
Transforming Computing with Quantum-Centric Supercomputing
Poster Setup
Sat Jun 06 10:15 AM -- 10:45 AM (PDT) @ ExHall A
Poster Setup
Demonstration
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F
Demos
Doctoral Consortium
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ 207
Doctoral Consortium (By invitation only)
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 1
Breaking Semantic Boundaries: Distribution-Guided Semantic Exploration for Creative Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 2
Guiding a Diffusion Model by Swapping Its Tokens
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 3
PixelDiT: Pixel Diffusion Transformers for Image Generation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 5
SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 6
Streaming Diffusion Model for Fast Infrared and Visible Video Fusion
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 7
ComPose: A Unified Completion-Pose Framework for Robust Category-Level Object Pose Estimation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 8
CoSMo3D: Open-World Promptable 3D Semantic Segmentation through LLM-Guided Canonical Spatial Modeling
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 9
GeoViS: Geospatially Rewarded Visual Search for Remote Sensing Visual Grounding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 10
RobotSeg: A Model and Dataset for Segmenting Robots in Image and Video
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 11
S^2AM3D: Scale-controllable Part Segmentation of 3D Point Clouds
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 12
Scalable Multi-View Subspace Clustering with Tensorized Anchor Guidance
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 13
3D-LATTE: Latent Space 3D Editing from Textual Instructions
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 14
AnchorFlow: Training-Free 3D Editing via Latent Anchor-Aligned Flows
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 15
ChordEdit: One-Step Low-Energy Transport for Image Editing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 16
Faithful Contouring: Near-Lossless 3D Voxel Representation Free from Iso-surface
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 17
Native and Compact Structured Latents for 3D Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 18
SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 19
Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 20
FINER: MLLMs Hallucinate under Fine-grained Negative Queries
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 21
MDCS-MoAME: Multi-directional Composite Scanning with Mixture of Attention and Mamba Experts for Cancer Survival Prediction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 22
PAS: A Training-Free Stabilizer for Temporal Encoding in Video LLMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 23
PAVAS: Physics-Aware Video-to-Audio Synthesis
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 24
ProPhy: Progressive Physical Alignment for Dynamic World Simulation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 25
V-DPM: 4D Video Reconstruction with Dynamic Point Maps
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 26
Registration-Free Learnable Multi-View Capture of Faces in Dense Semantic Correspondence
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 27
Mesh4D: 4D Mesh Reconstruction and Tracking from Monocular Video
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 28
SPE-MVS: Spatial Position Encoding Enhanced Multi-View Stereo with Monocular Depth Priors
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 29
Block-Sparse Global Attention for Efficient Multi-View Geometry Transformers
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 30
SMVRT: Implicit Human 3D Modeling Using Sparse Multi-View Volumetric Reconstruction with Transformer Fusion
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 31
LiDAR Prompted Spatio-Temporal Multi-View Stereo for Autonomous Driving
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 32
Any4D: Unified Feed-Forward Metric 4D Reconstruction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 33
Co-Me: Confidence Guided Token Merging for Visual Geometric Transformers
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 34
Point4Cast: Streaming Dynamic Scene Reconstruction and Forecasting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 35
AMB3R: Accurate Feed-forward Metric-scale 3D Reconstruction with Backend
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 36
AlignPose: Generalizable 6D Pose Estimation via Multi-view Feature-metric Alignment
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 37
Parallelised Differentiable Straightest Geodesics for 3D Meshes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 38
Geometry-Aligned and Anomaly-Aware Reconstruction for 3D Anomaly Detection
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 39
DVGT: Driving Visual Geometry Transformer
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 41
MoRE: 3D Visual Geometry Reconstruction Meets Mixture-of-Experts
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 42
Foundation Encoders Are All You Need for Preference-Aware Personalization
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 43
Where Culture Fades: Revealing the Cultural Gap in Text-to-Image Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 44
ThinkGen: Generalized Thinking for Visual Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 45
CoLoGen: Progressive Learning of Concept–Localization Duality for Unified Image Generation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 46
Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 47
When Safety Collides: Resolving Multi-Category Harmful Conflicts in Text-to-Image Diffusion via Adaptive Safety Guidance
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 48
PSR: Scaling Multi-Subject Personalized Image Generation with Pairwise Subject-Consistency Rewards
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 49
HBridge: H-Shape Bridging of Heterogeneous Experts for Unified Multimodal Understanding and Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 50
Multimodal Semantic Bias Mitigation for Diverse Text-To-3D Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 51
Visual Personalization Turing Test
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 52
Composing Concepts from Images and Videos via Concept-prompt Binding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 53
Less is More: Data-Efficient Adaptation for Controllable Text-to-Video Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 54
Semantic Derivative Flow: Graph-Guided Diffusion for Controllable Instance Interactions
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 55
Improving Text-to-Image Generation with Intrinsic Self-Confidence Rewards
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 56
Hierarchical Enhancement of Semantic Priors for Disentangled Text-Driven Motion Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 57
Simpleposter: A Simple Baseline For Product Poster Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 58
Prompt Yourself: Awakening Textual Semantics in 1D Visual Tokenizers
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 59
SkyReels-Text: Fine-Grained Font-Controllable Text Editing for Poster Design
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 60
Image Generation from Contextually-Contradictory Prompts
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 61
PromptEnhancer: Taming Your Rewriter for Text-to-Image Generation via Fine-Grained Reward
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 62
Aligning Text, Images and 3D Structure Token-by-Token
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 63
RefTon: Reference person shot assist virtual Try-on
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 64
GaussianVision: Vision-Language Alignment from Compressed Image Representations using 2D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 65
Copy-Transform-Paste: Zero-Shot Object-Object Alignment Guided by Vision-Language and Geometric Constraints
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 66
Gravitation-Driven Semantic Alignment for Text Video Retrieval
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 67
MoE-GRPO: Optimizing Mixture-of-Experts via Reinforcement Learning in Vision-Language Models
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 68
M^3KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 69
Evolutionary Multimodal Reasoning via Hierarchical Semantic Representation for Intent Recognition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 70
ReFAct: Empowering Multimodal Web Agents with Visual and Context Focusing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 72
MR-RAG: Multimodal Relevance-Aware Retrieval-Augmented Generation for Medical Visual Question Answering
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 73
Decoupling Stability and Plasticity for Multi-Modal Test-Time Adaptation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 74
CUE: Concept-Aware Multi-Label Expansion to Mitigate Concept Confusion in Long-Tailed Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 75
Energy Waveify and Redistribution for Test-Time Adaptation: A Control System Perspective
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 76
CD-Buffer: Complementary Dual-Buffer Framework for Test-Time Adaptation in Adverse Weather Object Detection
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 77
CoFiDA-M: Concept-Aware Feature Modulation for Cross-Domain Adaptation with Image-Only Inference
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 78
Towards Multimodal Domain Generalization with Few Labels
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 79
Reclaiming Lost Text Layers for Source-Free Cross-Domain Few-Shot Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 80
Event6D: Event-based Novel Object 6D Pose Tracking
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 81
EV-CGNet: Co-visible Focused 3D-guided 2D Event Keypoint Detection Network
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 82
AE2VID: Event-based Video Reconstruction via Aperture Modulation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 83
From Contrast to Consistency: Rethinking Event-based Continuous-Time Optical Flow Estimation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 84
Spike-driven Discrete Aggregation for Event-based Object Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 85
x^2-Fusion: Cross-Modality and Cross-Dimension Flow Estimation in Event Edge Space
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 86
FloVerse: Floor Plan-Guided Multi-Modal Navigation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 87
TrajRAG: Retrieving Geometric-Semantic Experience for Zero-Shot Object Navigation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 88
History to Future: Evolving Agent with Experience and Thought for Zero-shot Vision-and-Language Navigation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 89
DreamSAC: Learning Hamiltonian World Models via Symmetry Exploration
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 90
Beyond Scanpaths: Graph-Based Gaze Simulation in Dynamic Scenes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 91
CGL: Advancing Continual GUI Learning via Reinforcement Fine-Tuning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 92
Rethinking Visual Rearrangement from A Diffusion Perspective
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 93
APEX: A Decoupled Memory-based Explorer for Asynchronous Aerial Object Goal Navigation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 94
Bridging the 2D-3D Gap: A Hierarchical Semantic-Geometric Map for Vision Language Navigation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 95
InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 96
When Robots Should Say "I Don’t Know": Benchmarking Abstention in Embodied Question Answering
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 97
RoboAgent: Chaining Basic Capabilities for Embodied Task Planning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 98
Towards Training-free Scene Text Editing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 99
VINS-120K: Ultra High-Resolution Image Editing with A Large-Scale Dataset
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 100
ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 101
Charge: A Comprehensive Novel View Synthesis Benchmark and Dataset to Bind Them All
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 102
Region-Wise Correspondence Prediction between Manga Line Art Images
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 103
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 104
I2I-Bench: A Comprehensive Benchmark Suite for Image-to-Image Editing Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 105
TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 106
Hermite Radial Basis Function for Surface Reconstruction via Differentiable Rendering
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 107
RF4D: Neural Radar Fields for Novel View Synthesis in Outdoor Dynamic Scenes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 108
Voxify3D: Pixel Art Meets Volumetric Rendering
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 109
Node-RF: Learning Generalized Continuous Space-Time Scene Dynamics with Neural ODE-based NeRFs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 110
FluidGaussian: Propagating Simulation-Based Uncertainty Toward Functionally-Intelligent 3D Reconstruction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 111
GaussFusion: Improving 3D Reconstruction in the Wild with A Geometry-Informed Video Generator
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 112
LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 113
Turbo-GS: Accelerating 3D Gaussian Fitting for High-Resolution Radiance Fields
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 114
BiProLoRA: Bilevel Prompt LoRA for Real Scene Recovery
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 115
Degradation-Consistent Test-Time Adaptation for All-in-One Image Restoration
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 116
CanonCGT: Reference-Based Color Grading via Canonical Pivot Representation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 117
2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 118
Restore, Assess, Repeat: A Unified Framework for Iterative Image Restoration
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 119
It Takes Two: A Duet of Periodicity and Directionality for Burst Flicker Removal
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 120
Scan Clusters, Not Pixels: A Cluster-Centric Paradigm for Efficient Ultra-high-definition Image Restoration
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 121
Seeing Beyond 8bits: Subjective and Objective Quality Assessment of HDR-UGC Videos
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 122
Dynamic Exposure Burst Image Restoration
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 123
FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 124
ColorFLUX: A Structure-Color Decoupling Framework for Old Photo Colorization
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 125
VEMamba: Efficient Isotropic Reconstruction of Volume Electron Microscopy with Axial-Lateral Consistent Mamba
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 126
Anatomica: Localized Control over Geometric and Topological Properties for Anatomical Diffusion Models
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 127
EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 128
Underground Plant Exploration: Non-Destructive 3D Root Assessment with GPR Based on Point Graph Neural Network
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 129
Uni-Encoder Meets Multi-Encoders: Representation Before Fusion for Brain Tumor Segmentation with Missing Modalities
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 130
MicroFM: Physics-guided Flow Matching for Isotropic Microscopy Reconstruction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 131
Dynamic Stream Network for Combinatorial Explosion Problem in Deformable Medical Image Registration
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 133
Towards Robust Vision Transformers: Path Dependency Analysis and a Simple Two-Stage Adversarial Training
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 135
When CLIP Sees More, It Fights Back Harder: Multi-View Guided Adaptive Counterattacks for Test-Time Adversarial Robustness
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 136
Hidden Dangers of Compositional Generation: Diagnosing Semantic Safety Failures in Text-to-Image Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 137
VisiLock: Authorizing Instruction-based Image editing with Dual Score Distillation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 138
JANUS: A Lightweight Framework for Jailbreaking Text-to-Image Models via Distribution Optimization
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 139
GenBreak: Red Teaming Text-to-Image Generation Using Large Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 140
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 141
Generate, Analyze, and Refine: Training-Free Sound Source Localization via MLLM Meta-Reasoning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 142
MMCP-GEN: A Modality-Extensible Diffusion Language Model for Conditional Protein Sequence Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 143
Few-shot Acoustic Synthesis with Multimodal Flow Matching
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 144
CLIP-like Model as a Foundational Density Ratio Estimator
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 145
Learning What Matters: Prioritized Concept Learning via Relative Error-driven Sample Selection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 146
EgoAVU: Egocentric Audio-Visual Understanding
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 147
Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 148
Multimodal Protein Language Models for Enzyme Kinetic Parameters: From Substrate Recognition to Conformational Adaptation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 149
Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 150
Adaptive Confidence Regularization for Multimodal Failure Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 151
Factorize, Reconstruct, Enhance: A Unified Framework for Multimodal Sentiment Analysis
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 152
PhenoYieldNet: Learning Crop-Aware Phenological Responses for Multi-Crop Yield Prediction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 153
Conflict-Aware Adaptive Cross-Reconstruction for Multimodal Sentiment Analysis
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 154
EduDiag: A Benchmark for Educational Diagnostic Reasoning with Error Tracing and Correction on Large Multimodal Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 156
Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 157
ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 158
Cross-Modal Guided Visual Synthesis for Data-Efficient Multimodal Depression Recognition
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 159
AffordGrasp: Cross-Modal Diffusion for Affordance-Aware Grasp Synthesis
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 160
PAM: A Pose–Appearance–Motion Engine for Sim-to-Real HOI Video Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 161
AffordGen: Generating Diverse Demonstrations for Generalizable Object Manipulation with Affordance Correspondence
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 162
HandWorld: Hand-Centric Unified Video Action Generation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 163
HVG-3D: Bridging Real and Simulation Domains for 3D-Conditional Hand-Object Interaction Video Synthesis
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 164
ArtHOI: Taming Foundation Models for Monocular 4D Reconstruction of Hand-Articulated-Object Interactions
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 165
LAM: Language Articulated Object Modelers
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 166
Haptic Neural Fields: Bringing Tactile Interactions to 3D Rendered Scenes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 167
Open-world Hand-Object Interaction Video Generation Based on Structure and Contact-aware Representation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 168
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 169
From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 170
Temporal Equilibrium MeanFlow: Bridging the Scale Gap for One-Step Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 171
PROMO: Promptable Outfitting for Efficient High-Fidelity Virtual Try-On
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 172
Harmony: Harmonizing Audio and Video Generation through Cross-Task Synergy
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 173
UniSER: A Foundation Model for Unified Soft Effects Removal
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 174
EffectMaker: Unifying Reasoning and Generation for Customized Visual Effect Creation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 175
Inference-time Physics Alignment of Video Generative Models with Latent World Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 176
SMRABooth: Subject and Motion Representation Alignment for Customized Video Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 177
Plenoptic Video Generation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 178
PyramidalWan: On Making Pretrained Video Model Pyramidal for Efficient Inference
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 179
AdapTok: Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 180
OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 181
Flowception: Temporally Expansive Flow Matching for Video Generation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 182
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 183
Linear Image Generation by Synthesizing Exposure Brackets
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 184
Low-Resolution Editing is All You Need for High-Resolution Editing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 185
UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 186
iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 187
VENI: Variational Encoder for Natural Illumination
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 188
SketchAssist: A Practical Assistant for Semantic Edits and Precise Local Redrawing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 189
MultiShotMaster: A Controllable Multi-Shot Video Generation Framework
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 190
MoCha: End-to-End Video Character Replacement without Structural Guidance
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 191
Negative Binomial Variational Autoencoders for Overdispersed Latent Modeling
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 193
VOSR: A Vision-Only Generative Model for Image Super-Resolution
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 194
Dual Graph Regularized Deep Unfolding Network for Guided Depth Map Super-resolution
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 195
DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 197
Gradient Knows Best: Mixed-Precision Quantization via Gradient-Guided Bit Allocation for Super-Resolution
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 198
Toward Real-world Infrared Image Super-Resolution: A Unified Autoregressive Framework and Benchmark Dataset
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 199
Next-Scale Autoregressive Models for Text-to-Motion Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 200
Push-and-Step: From RL-Based Balance Recovery to Physical Simulation of Dense Crowds
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 201
Iterative Closed-Loop Motion Synthesis for Scaling the Capabilities of Humanoid Control
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 202
RoMo: A Large-Scale, Richly Organized Dataset and Semantic Taxonomy for Human Motion Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 203
FrankenMotion: Part-level Human Motion Generation and Composition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 204
HSI-GPT2: A Dual-Granularity Large Motion Reasoning Model with Diffusion Refinement for Human–Scene Interaction
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 205
SceMoS: Scene-Aware 3D Human Motion Synthesis by Planning with Geometry-Grounded Tokens
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 206
Progressive Guessing to Fixed Point: Rethinking Human Motion Prediction with Deep Equilibrium Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 207
Archon: A Unified Multimodal Model for Holistic Digital Human Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 208
ReMoGen: Real-time Human Interaction-to-Reaction Generation via Modular Learning from Diverse Data
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 209
Towards Motion Turing Test: Evaluating Human-Likeness in Humanoid Robots
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 210
PatchScene: Patch-based Voxel Diffusion Model for Large-Scale Scene Completion
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 211
Prototype-Guided Concept Erasure in Diffusion Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 212
Any2Any 3D Diffusion Models with Knowledge Transfer: A Radiotherapy Planning Study
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 213
CARD: Correlation Aware Restoration with Diffusion
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 214
DMAligner: Enhancing Image Alignment via Diffusion Model Based View Synthesis
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 215
DRiffusion: Draft-and-Refine Process Parallelizes Diffusion Models with Ease
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 216
Do Less, Achieve More: Do We Need Every-Step Optimization for RL Fine-tuning of Diffusion Models?
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 219
MMTIT-Bench: A Multilingual and Multi-Scenario Benchmark with Cognition–Perception–Reasoning Guided Text-Image Machine Translation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 220
M3DocDep: Multi-modal, Multi-page, Multi-document Dependency Chunking with Large Vision-Language Models
[
Slides]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 221
Towards Policy-Adaptive Image Guardrail: Benchmark and Method
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 222
Flat-Pack Bench: Evaluating Spatio-Temporal Understanding in Large Vision-Language Models through Furniture Assembly
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 223
TextFM: Robust Semi-dense Feature Matching with Language Guidance
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 224
What’s Wrong with Synthetic Data for Scene Text Recognition? A Strong Synthetic Engine with Diverse Simulations and Self-Evolution
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 225
Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 226
SJD-PAC: Accelerating Speculative Jacobi Decoding via Proactive Drafting and Adaptive Continuation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 227
Point Cloud as a Foreign Language for Multi-modal Large Language Model
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 228
Grounded 3D-Aware Spatial Vision-Language Modeling
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 229
SpatialTree: How Spatial Intelligence Branches Out in MLLMs
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 230
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 231
Beyond 3D VQAs: Injecting 3D Spatial Priors into Vision-Language Models for Enhanced Geometric Reasoning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 232
OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 233
BOP-ASK: Object-Interaction Reasoning for Vision-Language Models
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 235
Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 236
REALM: An MLLM-Agent Framework for Open World 3D Reasoning Segmentation and Editing on Gaussian Splatting
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 237
From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 238
MVGGT: Multimodal Visual Geometry Grounded Transformer for Multiview 3D Referring Expression Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 239
SpaceMind: Camera-Guided Modality Fusion for Spatial Reasoning in Vision-Language Models
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 240
ReMatch: Boosting Representation through Matching for Multimodal Retrieval
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 241
RI-Mamba: Rotation-Invariant Mamba for Robust Text-to-Shape Retrieval
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 242
Revisiting F-measure Optimization in Multi-Label Classification: A Sampling-based Approach
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 243
Thinking Beyond Labels: Vocabulary-Free Fine-Grained Recognition using Reasoning-Augmented LMMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 244
WISER: Wider Search, Deeper Thinking, and Adaptive Fusion for Training-Free Zero-Shot Composed Image Retrieval
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 245
Modeling the Visual Ambiguity of Human Sketches
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 246
SATTC: Structure-Aware Label-Free Test-Time Calibration for Cross-Subject EEG-to-Image Retrieval
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 247
ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 248
V^2-SAM: Marrying SAM2 with Multi-Prompt Experts for Cross-View Object Correspondence
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 249
WeaveTime: Streaming from Earlier Frames into Emergent Memory in VideoLLMs
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 250
Streaming Video Crime Anticipation with Spatio-Temporal Causal Reasoning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 251
Efficient Frame Selection for Long Video Understanding via Reinforcement Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 252
HieraMamba: Video Temporal Grounding via Hierarchical Anchor-Mamba Pooling
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 253
InternVideo-Next: Towards World-Understanding Video Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 254
Condensed Test-Time Adaptation of VLMs for Action Recognition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 255
Test-time Ego-Exo-centric Adaptation for Action Anticipation via Multi-Label Prototype Growing and Dual-Clue Consistency
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 256
A Stitch in Time: Learning Procedural Workflow via Self-Supervised Plackett–Luce Ranking
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 257
SurgCoT: Advancing Spatiotemporal Reasoning in Surgical Videos through a Chain-of-Thought Benchmark
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 258
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 259
Concept-Guided Fine-Tuning: Steering ViTs away from Spurious Correlations to Improve Robustness
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 260
Explaining Object Detectors via Collective Contribution of Pixels
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 261
Where MLLMs Attend and What They Rely On: Explaining Autoregressive Token Generation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 262
H-Sets: Hessian-Guided Discovery of Set-Level Feature Interactions in Image Classifiers
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 263
Evaluating Generative Models via One-Dimensional Code Distributions
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 264
TriDF: Evaluating Perception, Detection, and Hallucination for Interpretable DeepFake Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 265
BuildAnyPoint: 3D Building Structured Abstraction from Diverse Point Clouds
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 266
LiDAR-to-4DRadar Diffusion Bridge via Cross-Modal Alignment and Translation in Latent Space
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 267
Edges Compete for Trust: Group Relative Edge Optimization for Building Reconstruction from Point Clouds
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 268
Unsupervised Monocular 3D Keypoint Discovery from Multi-View Diffusion Priors
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 269
QD-PCQA: Quality-Aware Domain Adaptation for Point Cloud Quality Assessment
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 270
L3DR: 3D-aware LiDAR Diffusion and Rectification
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 271
Ghost-FWL: A Large-Scale Full-Waveform LiDAR Dataset for Ghost Detection and Removal
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 272
Ghosts in the Point Clouds: De-glaring LiDAR in the Transient Domain
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 273
MS^2Gait: A Multi-Scale Spatio-Temporal Fusion Network for LiDAR-based Gait Recognition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 275
Learning to Identify Out-of-Distribution Objects for 3D LiDAR Anomaly Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 276
Dual-Level Confidence based Implicit Self-Refinement for Medical Visual Question Answering
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 277
FedMPT: Federated Multi-Label Prompt Tuning of Vision-Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 278
Rethinking Model Selection in VLM Through the Lens of Gromov-Wasserstein Distance
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 279
NTK-Guided Implicit Neural Teaching
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 281
Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 282
Harmonious Parameter Adaptation in Continual Visual Instruction Tuning for Safety-Aligned MLLMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 283
StructXLIP: Enhancing Vision-language Models with Multimodal Structural Cues
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 284
Same or Not? Enhancing Visual Perception in Vision-Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 285
Vector Prism: Animating Vector Graphics by Stratifying Semantic Structure
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 286
AssemblyBench: Physics-Aware Assembly of Complex Industrial Objects
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 287
Animator-Centric Skeleton Generation on Objects with Fine-Grained Details
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 288
Synthesizing Visual Concepts as Vision-Language Programs
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 289
Self-Consistency for LLM-Based Motion Trajectory Generation and Verification
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 290
Semantic Scale Space: A Framework for Controllable Image Abstraction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 291
Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 292
DSFlash: Comprehensive Panoptic Scene Graph Generation in Realtime
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 293
SIF: Semantically In-Distribution Fingerprints for Large Vision-Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 294
Designing to Forget: Deep Semi-parametric Models for Unlearning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 295
Meta-FC: Meta-Learning with Feature Consistency for Robust and Generalizable Watermarking
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 296
PrivSynth: Alternating and Control-Based Optimization for Privacy and Utility in Synthetic Data
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 297
Neighbor-Aware Localized Concept Erasure in Text-to-Image Diffusion Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 298
EcoAlign: An Economically Rational Framework for Efficient LVLM Alignment
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 299
Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language Models
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 300
A Polynomial Chaos Framework for Causal Discovery in Nonlinear Uncertain Systems
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 302
From Selection to Scheduling: Federated Geometry-Aware Correction Makes Exemplar Replay Work Better under Continual Dynamic Heterogeneity
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 303
Fine-Tuning Impairs the Balancedness of Foundation Models in Long-tailed Personalized Federated Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 304
Few-for-Many Personalized Federated Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 305
ProxyFL: A Proxy-Guided Framework for Federated Semi-Supervised Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 306
Domain Sensitive Federated Learning with Fisher-Informed Pruning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 307
SPARROW: Learning Spatial Precision and Temporal Referential Consistency in Pixel-Grounded Video MLLMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 308
Bridging Facial Understanding and Animation via Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 309
AR²-4FV: Anchored Referring and Re-identification for Long-Term Grounding in Fixed-View Videos
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 310
CVA: Context-aware Video-text Alignment for Video Temporal Grounding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 311
OmniGround: A Comprehensive Spatio-Temporal Grounding Benchmark for Real-World Complex Scenarios
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 312
ST4R-Splat: Spatio-Temporal Referring Segmentation in 4D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 314
Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 315
Towards Unified Human Perception and Machine Understanding: Token Flow Guided Compression Framework
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 316
A More Word-like Image Tokenization for MLLMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 317
DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 318
Unified Spatiotemporal Token Compression for Video-LLMs at Ultra-Low Retention
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 319
One Layer’s Trash is Another Layer’s Treasure: Adaptive Layer-wise Visual Token Selection in LVLMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 320
OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 321
Tunable Soft Equivariance with Guarantees
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 322
Semi-Supervised Conformal Prediction With Unlabeled Nonconformity Score
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 323
Cluster-aware Anchor Learning for Multi-View Clustering
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 324
Revisiting Sparsity Constraint Under High-Rank Property in Partial Multi-Label Learning
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 325
Weight Space Representation Learning via Neural Field Adaptation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 326
Recurrent Video Masked Autoencoders
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 327
Revisiting Unknowns: Towards Effective and Efficient Open-Set Active Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 328
Seeing Through the Shift: Causality-Inspired Robust Generalized Category Discovery
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 329
From Exploration to Exploitation: A Two-Stage Entropy RLVR Approach for Noise-Tolerant MLLM Training
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 330
Spatial Retrieval Augmented Autonomous Driving
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 331
Scaling-Aware Data Selection for End-to-End Autonomous Driving Systems
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 332
ColaVLA: Leveraging Cognitive Latent Reasoning for Hierarchical Parallel Trajectory Planning in Autonomous Driving
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 333
CARD: A Multi-Modal Automotive Dataset for Dense 3D Reconstruction in Challenging Road Topography
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 334
MindDriver: Introducing Progressive Multimodal Reasoning for Autonomous Driving
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 335
WPT: World-to-Policy Transfer via Online World Model Distillation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 336
ClimaOoD: Improving Anomaly Segmentation via Physically Realistic Synthetic Data
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 337
Recover to Predict: Progressive Retrospective Learning for Variable-Length Trajectory Prediction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 338
URScenes: A Multi-scenario Dataset for Unstructured Road Environments
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 339
MeanFuser: Fast One-Step Multi-Modal Trajectory Generation and Adaptive Reconstruction via MeanFlow for End-to-End Driving
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 340
SAMosaic3D: Modular Scene Assembly for Real-Time 3D Segment Anything
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 342
MV3DIS: Multi-View Mask Matching via 3D Guides for Zero-Shot 3D Instance Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 343
PEARL: Geometry Aligns Semantics for Training-Free Open-Vocabulary Semantic Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 344
RAVEN: Radar Adaptive Vision Encoders for Efficient Chirp-wise Object Detection and Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 345
SAMIX: Reinforcing SAM2 with Semantic Adapter and Reference Selecting Policy for Mix-Supervised Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 346
MARSS: Radar Semantic Segmentation via Modular Attention and State Space Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 347
MixerCSeg: An Efficient Mixer Architecture for Crack Segmentation via Decoupled Mamba Attention
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 348
Exemplar-Free Class Incremental Learning via Preserving Class-Discriminative Structure
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 349
Critical Patch-Aware Sparse Prompting with Decoupled Training for Continual Learning on the Edge
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 350
PACT: Phase-Like Transition Constraints in Adapter-Based Continual Learning of Vision-Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 351
Representation-Steered Incremental Adapter-Tuning for Class-Incremental Learning with Pre-Trained Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 352
Re-evaluating Continual VQA: Toward Fair and Robust Evaluation for Multimodal Continual Learning
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 354
Enhancing Continual Learning of Vision-Language Models via Dynamic Prefix Weighting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 355
Beyond Myopic Alignment: Lookahead Optimization for Online Class-Incremental Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 356
EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 357
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 358
D^3FER: Dual Channel and Dual Branch Network for Robust Facial Expression Recognition under Dual Challenges
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 359
HumanNOVA: Photorealistic, Universal and Rapid 3D Human Avatar Modeling from a Single Image
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 360
ExpPortrait: Expressive Portrait Generation via Personalized Representation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 361
PersonaLive! Expressive Portrait Image Animation for Live Streaming
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 362
ProFocus: Proactive Perception and Focused Reasoning in Vision-and-Language Navigation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 363
OptiMVMap: Offline Vectorized Map Construction via Optimal Multi-vehicle Perspectives
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 364
CogDriver: Integrating Cognitive Inertia for Temporally Coherent Planning in Autonomous Driving
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 365
TopoHR: Hierarchical Centerline Representation for Cyclic Topology Reasoning in Driving Scenes with Point-to-Instance Relations
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 366
AURA: Multi-modal Shared Autonomy for Urban Navigation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 367
Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 368
FlexAvatar: Learning Complete 3D Head Avatars with Partial Supervision
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 369
Large-scale Codec Avatars: The Unreasonable Effectiveness of Large-scale Avatar Pretraining
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 370
UIKA: Fast Universal Head Avatar from Pose-Free Images
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 371
FlexAvatar: Flexible Large Reconstruction Model for Animatable Gaussian Head Avatars with Detailed Deformation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 372
First Logit Boosting: Visual Grounding Method to Mitigate Object Hallucination in Large Vision-Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 373
Locate-then-Sparsify: Attribution Guided Sparse Strategy for Visual Hallucination Mitigation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 374
Envision, Attend, Then Respond: Counterfactual Hallucination Mitigation in Large Vision-Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 375
PAS: Prelim Attention Score for Detecting Object Hallucinations in Large Vision-Language Models
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 376
MoD-DPO: Towards Mitigating Cross-modal Hallucinations in Omni LLMs using Modality Decoupled Preference Optimization
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 377
Fine-Grained Multi Image Object Hallucination Benchmark
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 378
Generative Video Motion Editing with 3D Point Tracks
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 379
BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 380
Learning to Generate Highly Dynamic Videos using Synthetic Motion Data
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 381
Stereo World Model: Camera-Guided Stereo Video Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 382
CG-Floor: Centroid-Guided Diffusion for Large-Scale Floorplan Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 383
MAD: Motion Appearance Decoupling for efficient Driving World Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 384
VDFE: Difference-Aware 3D Scene Editing with Non-Intrusive Video Diffusion Priors for Multi-View Consistency and Efficiency
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 385
Endless World: Real-Time 3D-Aware Long Video Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 386
SpatialDiff: 3D-Aware Object Movement via Implicit Spatial Modeling
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 387
Towards Realistic and Consistent Orbital Video Generation via 3D Foundation Priors
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 388
YOLO-ULM: Ultra-Lightweight Models for Real-Time Object Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 389
CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 390
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 391
VLM4RSDet: Collaborative Optimization with Vision-Language Model for Enhancing Remote Sensing Object Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 392
WiTTA-Bench: Benchmarking Test-Time Adaptation for WiFi Sensing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 393
MFEN: Multi-Frequency Expert Network for Visible-Infrared Person Re-ID
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 394
Object-Generalized Re-Identification: A Step Towards Universal Instance Perception
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 395
When Transformers Meet Mamba: A Hybrid Transformer-Mamba Network for Video Object Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 396
Prompt-Anchored Vision–Text Distillation for Lifelong Person Re-identification
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 397
HyperGait: Unleashing the Power of Parsing for Gait Recognition in the Wild via Hypergraph
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 398
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 399
Do You See What I Am Pointing At? Gesture-Based Egocentric Video Question Answering
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 400
Beyond Caption-Based Queries in Video Moment Retrieval
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 401
Neural-Centric Video Processing Pipeline for Unified Multi-Task Inference
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 402
VideoRealBench: A Chain-of-Thought Realism Evaluation Benchmark for Generated Human-Centric Videos
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 403
VAST: Video Ability-Stratified Taxonomy for Data-Efficient Video Reasoning
[
Slides]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 404
An Empirical Study on How Video-LLMs Answer Video Questions
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 405
FPSBench: A Benchmark for Video Understanding at High Frame Rates
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 406
UniComp: Rethinking Video Compression Through Informational Uniqueness
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 407
NaTex: Seamless Texture Generation as Latent Color Diffusion
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 408
Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion Models
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 409
Pluggable Pruning with Contiguous Layer Distillation for Diffusion Transformers
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 410
Attribute-Preserving Pseudo-Labeling for Diffusion-Based Face Swapping
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 411
Delta Rectified Flow Sampling for Text-to-Image Editing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 412
Training-free Mixed-Resolution Latent Upsampling for Spatially Accelerated Diffusion Transformers
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 413
SpotEdit: Selective Region Editing in Diffusion Transformers
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 414
All-in-One Slider for Attribute Manipulation in Diffusion Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 415
DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 416
From Sketch to Fresco: Efficient Diffusion Transformer with Progressive Resolution
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 417
CATNet: Collaborative Alignment and Transformation Network for Cooperative Perception
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 418
Scene Reconstruction as Mapping Priors for 3D Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 419
CCF: Complementary Collaborative Fusion for Domain Generalized Multi-Modal 3D Object Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 420
Unleashing the Power of Chain-of-Prediction for Monocular 3D Object Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 421
R4Det: 4D Radar-Camera Fusion for High-Performance 3D Object Detection
[
Slides]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 422
Revisiting Token Compression for Accelerating ViT-based Sparse Multi-View 3D Object Detectors
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 423
Few-Shot Incremental 3D Object Detection in Dynamic Indoor Environments
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 424
Learning from Synthetic Data via Provenance-Based Input Gradient Guidance
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 425
Seeing Clearly, Reasoning Confidently: Plug-and-Play Remedies for Vision Language Model Blindness
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 427
R2G: A Multi-View Circuit Graph Benchmark Suite from RTL to GDSII
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 428
VQ-VA World: Towards High-Quality Visual Question-Visual Answering
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 430
Beyond Multiple Choice: Verifiable OpenQA for Robust Vision-Language RFT
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 431
See Further, Think Deeper: Advancing VLM's Reasoning Ability with Low-level Visual Cues and Reflection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 432
PDCR: Perception-Decomposed Confidence Reward for Vision-Language Reasoning
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 433
μVLM: A Vision Language Model for μNPUs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 434
Gaussian Mapping for Evolving Scenes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 435
Part-aware Modeling of Articulated Objects using 3D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 436
AnchorSplat: Feed-Forward 3D Gaussian Splatting With 3D Geometric Priors
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 437
SGAD-SLAM: Splatting Gaussians at Adjusted Depth for Better Radiance Fields in RGBD SLAM
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 438
Faster-GS: Analyzing and Improving Gaussian Splatting Optimization
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 439
Layered 4D-Rotor Gaussian Splatting: A Compressed Representation for Long Dynamic Scenes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 440
GaussianGrow: Geometry-aware Gaussian Growing from 3D Point Clouds with Text Guidance
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 441
PhysGS: Bayesian-Inferred Gaussian Splatting for Physical Property Estimation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 442
3D Gaussian Splatting at Arbitrary Resolutions with Compact Proxy Anchors
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 443
Stochastic Ray Tracing for the Reconstruction of 3D Gaussian Splatting
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 444
AeroDGS: Physically Consistent Dynamic Gaussian Splatting for Single-Sequence Aerial 4D Reconstruction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 445
GaussianPile: A Unified Sparse Gaussian Splatting Framework for Slice-based Volumetric Reconstruction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 446
More Natural, More Real: Object-aware Gaussian Splatting for 3D Visual Decoding from Human Brain
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 447
Eulerian Gaussian Splatting using Hashed Probability Pyramids
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 448
Confidence-Guided Multi-Scale Aggregation for Sparse-View High-Resolution 3D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 449
ULF-Loc: Unbiased Landmark Feature for Robust Visual Localization with 3D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 450
Robust3DGSW: Toward Robust Watermarking for Quantization-Aware 3D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 451
ParkGaussian: Surround-view 3D Gaussian Splatting for Autonomous Parking
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 452
L^2DGS: Low-Light Dynamic Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 453
Probabilistic Concept Graph Reasoning for Multimodal Misinformation Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 454
POINTS-Long: Adaptive Dual-Mode Visual Reasoning in MLLMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 455
SegCompass: Exploring Interpretable Alignment with Sparse Autoencoders for Enhanced Reasoning Segmentation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 456
CRIT: Graph-Based Automatic Data Synthesis to Enhance Cross-Modal Multi-Hop Reasoning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 457
DeepScan: A Training-Free Framework for Visually Grounded Reasoning in Large Vision-Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 459
HUMORCHAIN: Theory-Guided Multi-Stage Reasoning for Interpretable Multimodal Humor Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 460
CodeDance: A Dynamic Tool-integrated MLLM for Executable Visual Reasoning
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 461
Rethinking MLLM Itself as a Segmenter with a Single Segmentation Token
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 462
Video-Only ToM: Enhancing Theory of Mind in Multimodal Large Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 463
Mario: Multimodal Graph Reasoning with Large Language Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 464
Boosting Reasoning in Large Multimodal Models via Activation Replay
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 465
Rationale-Enhanced Decoding for Multi-modal Chain-of-Thought
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 466
Mimic Human Cognition, Master Multi-Image Reasoning: A Meta-Action Framework for Enhanced Visual Understanding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 467
ROSE: Rotate Your Large Language Model to See
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 468
OpenMMReasoner: Pushing the Frontiers in Multimodal Reasoning with an Open and General Recipe
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 470
Sparsity as a Key: Unlocking New Insights from Latent Structures for Out-of-Distribution Detection
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 471
SparVAR: Exploring Sparsity in Visual AutoRegressive Modeling for Training-Free Acceleration
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 472
Suppressing Non-Semantic Noise in Masked Image Modeling Representations
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 474
DeDelayed: Deleting Remote Inference Delay via On-Device Correction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 476
Gaussian Splatting-based Low-Rank Tensor Representation for Multi-Dimensional Image Recovery
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 477
Precise Object and Effect Removal with Adaptive Target-Aware Attention
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 478
Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 479
FreqSIC: Frequency-aware Stereo Image Compression with Bi-directional Checkerboard Context Model
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 480
SinGeo: Unlock Single Model's Potential for Robust Cross-View Geo-Localization
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 481
Fusion of Depth and Semantics for Probabilistic Floorplan Localization
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 482
A2GC: Asymmetric Aggregation with Geometric Constraints for Locally Aggregated Descriptors
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 483
Geo2: Geometry-Guided Cross-view Geo-Localization and Image Synthesis
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 485
Resolving Evidence Sparsity: Agentic Context Engineering for Long-Document Understanding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 486
Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 487
ORCA: Orchestrated Reasoning with Collaborative Agents for Document Visual Question Answering
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 488
MSJoE: Jointly Evolving MLLM and Sampler for Efficient Long-Form Video Understanding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 489
A Multi-Agent Perception-Action Alliance for Efficient Long Video Reasoning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 490
Saliency-Guided Representation with Consistency Policy Learning for Visual Unsupervised Reinforcement Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 491
LensWalk: Agentic Video Understanding by Planning How You See in Videos
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 492
DPGF-Net: Dual-Prior Guided Fusion Network for Joint Assessment of Perceptual Quality and Semantic Consistency in AI-Generated Images
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 493
RegionFuse: Region-Adaptive Pixel Distribution Learning for Infrared and Visible Image Fusion
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 494
Missing No More: Dictionary-Guided Cross-Modal Image Fusion under Missing Infrared
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 495
VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 496
TAPE: Task-Adaptive Prototype Evolution in Audio-Language Models for Fully Few-shot Class-incremental Audio Classification
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 497
Remedying Target-Domain Astigmatism for Cross-Domain Few-Shot Object Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 499
Hyperbolic Defect Feature Synthesis for Few-Shot Defect Classification
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 500
Training-Only Heterogeneous Image-Patch-Text Graph Supervision for Advancing Few-Shot Learning Adapters
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 501
Learning to Learn Weight Generation via Local Consistency Diffusion
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 502
Balanced Dataset Distillation via Modeling Multiple Visual Pattern Distribution
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 503
Grid Distillation: Compositional Image Distillation via Structured Generative Grids
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 504
Dataset Distillation by Influence Matching
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 505
StableMaterials: Enhancing Diversity in Material Generation via Semi-Supervised Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 506
Seeing Through Blur: Tackling Defocus in Spike-Based Imaging
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 507
Distilling Quasi-Conformal Mapping: A Generalizable and Efficient Solution for Wide-Angle Correction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 508
Lighting in Motion: Spatiotemporal HDR Lighting Estimation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 509
LightRR: A Lightweight Network for Single Image Reflection Removal
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 510
HFR and HDR Video from Multi-Attenuated Spikes Using a Rapidly Rotating SpokeND Filter
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 511
Coded-E2LF: Coded Aperture Light Field Imaging from Events
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 512
TokenLight: Precise Lighting Control in Images using Attribute Tokens
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 513
Kaleidoscopic Scintillation Event Imaging
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 514
gQIR: Generative Quanta Image Reconstruction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 515
Solving Minimal Problems Without Matrix Inversion Using FFT-Based Interpolation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 516
Predicting Spatial Transcriptomics from Histology Images via High-Order Multi-Cell Interaction Modeling
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 517
From Spots to Pixels: Dense Spatial Gene Expression Prediction from Histology Images
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 519
LightSplat: Fast and Memory-Efficient Open-Vocabulary 3D Scene Understanding in Five Seconds
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 520
Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 521
Zero-Shot Depth Completion with Vision-Language Model
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 522
FE2E: From Editor to Dense Geometry Estimator
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 523
Ego-1K – A Large-Scale Multiview Video Dataset for Egocentric Vision
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 524
Edit-As-Act: Goal-Regressive Planning for Open-Vocabulary 3D Indoor Scene Editing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 525
VGGT-360: Geometry-Consistent Zero-Shot Panoramic Depth Estimation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 526
NI-Tex: Non-isometric Image-based Garment Texture Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 527
Velox: Learning Representations of 4D Geometry and Appearance
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 528
UniPixie: Unified and Probabilistic 3D Physics Learning via Flow Matching
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 529
UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 530
Points-to-3D: Structure-Aware 3D Generation with Point Cloud Priors
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 531
PartDiffuser: Part-wise 3D Mesh Generation via Discrete Diffusion
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 532
LoST: Level of Semantics Tokenization for 3D Shapes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 533
Lafite: A Generative Latent Field for 3D Native Texturing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 534
Image-Guided Geometric Stylization of 3D Meshes
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 535
LATTICE: Democratize High-Fidelity 3D Generation at Scale
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 536
Dehallu3D: Hallucination-Mitigated 3D Generation from a Single Image via Cyclic View Consistency Refinement
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 537
MeshMosaic: Scaling Artist Mesh Generation via Local-to-Global Assembly
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 538
TacSIm: A Dataset and Benchmark for Football Tactical Style Imitation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 539
DynamicsBoost: Dynamic Plausible Video Generation via Annotation-Free Continuation Preference Optimization
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 540
Reinforcement-Guided Synthetic Data Generation for Privacy-Sensitive Identity Recognition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 541
Fine-Grained GRPO for Precise Preference Alignment in Flow Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 542
Lighting-grounded Video Generation with Renderer-based Agent Reasoning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 543
RewardFlow: Generate Images by Optimizing What You Reward
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 544
Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 545
Self-Corrected Image Generation with Explainable Latent Rewards
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 546
Polyphony: Diffusion-based Dual-Hand Action Segmentation with Alternating Vision Transformer and Semantic Conditioning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 547
Reading Your Actions: Learning Generalizable Action Representations via Pre-training AEMG
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 548
MA-Bench: Towards Fine-grained Micro-Action Understanding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 551
DarkShake-DVS: Event-based Human Action Recognition under Low-light and Shaking Camera Conditions
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 552
Protect to Adapt: Subspace-Constrained Adaptation with Ranked Negative Prompt Feedback for Few-Shot Action Recognition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 553
SkeletonContext: Skeleton-side Context Prompt Learning for Zero-Shot Skeleton-based Action Recognition
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 554
InTrain: Intrinsic Trainability for Zero-Cost Neural Architecture Search
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 555
S^2FT: Parameter-Efficient Fine-Tuning in Sparse Spectrum Domain
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 556
Rethinking SNN Online Training and Deployment: Gradient-Coherent Learning via Hybrid-Driven LIF Model
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 558
Towards Efficient Medical Reasoning with Minimal Fine-Tuning Data
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 559
AdaBet: Gradient-free Layer Selection for Efficient Training of Deep Neural Networks
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 560
TAS-LoRA: Transformer Architecture Search with Mixture-of-LoRA Experts
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 561
QuCNet: Quantum Deep Learning Driven Multi-Circuit Network for Remote Sensing Image Classification
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 562
Learning to Solve PDEs on Neural Shape Representations
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 563
Frequency Switching Mechanism for Parameter-Efficient Multi-Task Learning
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 564
Reconstructing Spiking Neural Networks Using a Single Neuron with Autapses
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 566
GUI-CEval: A Hierarchical and Comprehensive Chinese Benchmark for Mobile GUI Agents
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 567
FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 568
Streamlined Open-Vocabulary Human-Object Interaction Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 569
Decompose and Transfer: CoT-Prompting Enhanced Alignment for Open-Vocabulary Temporal Action Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 571
Boosting Quantitive and Spatial Awareness for Zero-Shot Object Counting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 572
Parameter-Efficient Semantic Augmentation for Enhancing Open-Vocabulary Object Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 573
WeDetect: Fast Open-Vocabulary Object Detection as Retrieval
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 574
Open-Vocabulary Domain Generalization in Urban-Scene Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 575
OpenDPR: Open-Vocabulary Change Detection via Vision-Centric Diffusion-Guided Prototype Retrieval for Remote Sensing Imagery
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 576
Annotation-Efficient Coreset Selection for Context-dependent Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 577
ALLNet: Multi-task Dense Prediction for Degraded Images
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 578
Geometry-Aware Cross-Modal Graph Alignment for Referring Segmentation in 3D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 580
GenMask: Adapting DiT for Segmentation via Direct Mask Generation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 581
Frequency-Aware Affinity for Weakly Supervised Semantic Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 582
Learning and Aligning Click-Aware Shape Prior for Interactive Amodal Instance Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 583
Beyond Reassembly: Fractured Object Recovery with Missing Parts
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 584
Best Segmentation Buddies for Image-Shape Correspondence
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 585
RMAE-ProGRess: Advancing Semantic Segmentation in Unstructured Environments
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 587
Orthogonal Spatial-Aware Multi-View Anchor Graph Clustering for Incomplete Remote Sensing Data
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 589
SkySense-VITA: Towards Universal In-context Segmentation of Multi-modal Remote Sensing Imagery
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 590
ProM3E: Probabilistic Masked MultiModal Embedding Model for Ecology
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 591
GeoCoT: Towards Reliable Remote Sensing Reasoning with Manifold Perspective
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 593
NeighborMAE: Exploiting Spatial Dependencies between Neighboring Earth Observation Images in Masked Autoencoders Pretraining
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 594
GeoDiT: A Diffusion-based Vision-Language Model for Geospatial Understanding
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 595
Balanced Hierarchical Contrastive Learning with Decoupled Queries for Fine-grained Object Detection in Remote Sensing Images
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 597
Improving Adversarial Transferability with Local Perturbation Augmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 598
Echoes of Ownership: Adversarial-Guided Dual Injection for Copyright Protection in MLLMs
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 599
Stealing Split Learning Bottom Models by Recovering Embedding Geometry
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 600
PoInit-of-View: Poisoning Initialization of Views Transfers Across Multiple 3D Reconstruction Systems
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 601
No Way To Steal My Face: Proactive Defense Against Identity-Preserving Personalized Generation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 602
Towards Reliable Evaluation of Adversarial Robustness for Spiking Neural Networks
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 603
Where, What, Why: Toward Explainable 3D-GS Watermarking
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 604
Robust Spiking Neural Networks by Temporal Mutual Information
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 605
TraceGen: World Modeling in 3D Trace Space Enables Learning from Cross-Embodiment Videos
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 606
HiF-VLA: Hindsight, Insight and Foresight through Motion Representation for Vision-Language-Action Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 607
AtomicVLA: Unlocking the Potential of Atomic Skill Learning in Robots
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 608
Obstruction Reasoning for Robotic Grasping
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 609
PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 610
CycleManip: Enabling Cycle-based Manipulation via Effective History Perception and Understanding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 611
SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 612
Adaptive Action Chunking at Inference-time for Vision-Language-Action Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 613
Localizing, Structuring, and Rendering: Bridging 3D and 2D Vision-Language-Action Models for Robotic Manipulation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 614
NIL: No-data Imitation Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 615
Humanoid Generative Pre-Training for Zero-Shot Motion Tracking
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 616
EnergyAction: Unimanual to Bimanual Composition with Energy-Based Models
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 617
CUBic: Coordinated Unified Bimanual Perception and Control Framework
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 618
RehearseVLA: Simulated Post-Training for VLAs with Physically-Consistent World Model
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 619
GraspGen-X: Cross-Embodiment 6-DOF Diffusion-based Grasping
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 620
UETrack: A Unified and Efficient Framework for Single Object Tracking
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 621
ProgTrack: A Multi-Object Tracking Algorithm with Progressive Matching Strategy
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 622
Efficient Video Object Segmentation and Tracking with Recurrent Dynamic Submodel
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 623
Learning to Track Instance from Single Nature Language Description
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 624
MV-TAP: Tracking Any Point in Multi-View Videos
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 625
Adaptive Depth Lightweight RGB-T Tracking with Holistic Token Routing
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 626
Content-Adaptive Hierarchical Hyperprior for Neural Video Coding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 627
UTPTrack: Towards Simple and Unified Token Pruning for Visual Tracking
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 628
Similarity-as-Evidence: Calibrating Overconfident VLMs for Interpretable and Label-Efficient Medical Active Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 629
From Infusion to Assimilation Distillation for Medical Image Segmentation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 631
Unlocking Positive Transfer in Incrementally Learning Surgical Instruments: A Self-reflection Hierarchical Prompt Framework
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 632
Keep It Frozen: Domain-Routed Conditional Residual Modulation for Multi-Domain Vision Transformers
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 634
MedLoc-R1: Performance-Aware Curriculum Reward Scheduling for GRPO-Based Medical Visual Grounding
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 635
Turning Pre-Trained Vision Transformers into End-to-End Histopathology Whole Slide Image Models for Survival Prediction
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 636
A Supervised Multi-task Framework for Joint cryo-ET Restoration Enabled by Generative Physical Simulation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 637
KAMP: Knowledge-Anchored Multimodal Pretraining Framework for Medical Image Representation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 638
CARE: A Molecular-Guided Foundation Model with Adaptive Region Modeling for Whole Slide Image Analysis
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 639
Contrastive Cross-Bag Augmentation for Multiple Instance Learning-based Whole Slide Image Classification
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 640
OmniFM: Toward Modality-Robust and Task-Agnostic Federated Learning for Heterogeneous Medical Imaging
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 641
Learning complete and explainable visual representations from itemized text supervision
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 642
EgoPoseFormer v2: Accurate Egocentric Human Motion Estimation for AR/VR
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 643
MetricHMSR: Metric Human Mesh and Scene Recovery from Monocular Images
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 644
Differentially Private 2D Human Pose Estimation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 645
TROPHIES: Temporal Reconstruction of Places, Humans, and Cameras from Multi-view Videos
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 646
PoseD-Flow: Versatile and Guided Flow Matching Model of Human Pose
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 648
HUMAPS-4D: A Multimodal Dataset for HUman Motion Analysis with Physiological and Semantic informations
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 649
PHASE-Net: Physics-Grounded Harmonic Attention System for Efficient Remote Photoplethysmography Measurement
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 650
LAMP: Localization Aware Multi-camera People Tracking in Metric 3D World
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 651
Expanding mmWave Datasets for Human Pose Estimation with Unlabeled Data and LiDAR Datasets
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 652
Towards Balanced Multi-Modal Learning in 3D Human Pose Estimation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 653
OMGTex: One-stage Multi-style Facial Texture Reconstruction without Geometry Guidance
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 654
Human Interaction-Aware 3D Reconstruction from a Single Image
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 655
Towards Generalizable AI-Generated Image Detection via Image-Adaptive Prompt Learning
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 656
SAGA: Source Attribution of Generative AI Videos
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 657
VMD-FACT: A New Video Dataset and MLLM-based method for Detecting Realistic AI-Generated Video Misinformation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 658
ReAlign: Generalizable Image Forgery Detection via Reasoning-Aligned Representation
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 659
A Sanity Check for Multi-In-Domain Face Forgery Detection in the Real World
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 660
PPM-CLIP: Probabilistic Prompt Modeling for Generalizable AI-Generated Image Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 661
Learning from Noisy Supervision: A Denoising–Debiasing Framework for Weakly Supervised Video Anomaly Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 662
Anomaly as Non-Conformity via Training-Free Graph Laplacian Energy Minimization
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 663
VisualAD: Language-Free Zero-Shot Anomaly Detection via Vision Transformer
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 664
CHAL: Causal-guided Hierarchical Anomaly-aware Learning for Moving Infrared Small Target Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 665
RAID: Retrieval-Augmented Anomaly Detection
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 666
ADSeeker: A Knowledge-Grounded Reasoning Framework for Industry Anomaly Detection and Reasoning
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 668
QueryOcc: Query-based Self-Supervision for 3D Semantic Occupancy
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 669
GSV2X: Geometry-Aware Uncertainty Modeling and Orthogonal Fusion for Robust Roadside Perception
[
Poster]
Poster
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F 670
Grounded Latents for Entity-Centric 4D Scene Generation
[
Poster]
Poster Session
Sat Jun 06 10:45 AM -- 12:45 PM (PDT) @ ExHall F None
Poster Session 3 & Exhibit Hall
Art Program
Sat Jun 06 10:45 AM -- 05:00 PM (PDT) @ ExHall F None
Art Exhibition
Art Program
Sat Jun 06 10:45 AM -- 11:15 AM (PDT) @ ExHall F None
Art Gallery Tour with Curator and Artists
Art Program
Sat Jun 06 12:45 PM -- 01:45 PM (PDT) @ Room 201 None
Art Panel
Oral
Sat Jun 06 01:00 PM -- 01:12 PM (PDT) @ Four Seasons Ballroom None
CodeV: Code with Images for Faithful Visual Reasoning via Tool-Aware Policy Optimization
Oral
Sat Jun 06 01:00 PM -- 01:12 PM (PDT) @ Bluebird Ballroom None
Chorus: Multi-Teacher Pretraining for Holistic 3D Gaussian Scene Encoding
Oral
Sat Jun 06 01:00 PM -- 01:12 PM (PDT) @ Mile High Ballroom 1A - 2A None
Breaking the Scalability Limit of Multi-Projector Calibration with Embedded Cameras
Oral
Sat Jun 06 01:00 PM -- 01:12 PM (PDT) @ Mile High Ballroom 3A - 4A None
INSID3: Training-Free In-Context Segmentation with DINOv3
Oral Session
Sat Jun 06 01:00 PM -- 02:15 PM (PDT) @ Mile High Ballroom 3A - 4A None
Oral Session 4D: Visual Segmentation
Oral Session
Sat Jun 06 01:00 PM -- 02:15 PM (PDT) @ Four Seasons Ballroom None
Oral Session 4B: Embodied & Agentic Intelligence
Oral Session
Sat Jun 06 01:00 PM -- 02:15 PM (PDT) @ Mile High Ballroom 1A - 2A None
Oral Session 4C: Spatial Reasoning
Oral Session
Sat Jun 06 01:00 PM -- 02:15 PM (PDT) @ Bluebird Ballroom None
Oral Session 4A: Geometric Understanding
Oral
Sat Jun 06 01:12 PM -- 01:25 PM (PDT) @ Four Seasons Ballroom None
NitroGen: An Open Foundation Model for Generalist Gaming Agents
Oral
Sat Jun 06 01:12 PM -- 01:25 PM (PDT) @ Mile High Ballroom 1A - 2A None
GaussianFluent: Gaussian Simulation for Dynamic Scenes with Mixed Materials
Oral
Sat Jun 06 01:12 PM -- 01:25 PM (PDT) @ Bluebird Ballroom None
Featurising Pixels from Dynamic 3D Scenes with Linear In-Context Learners
Oral
Sat Jun 06 01:12 PM -- 01:25 PM (PDT) @ Mile High Ballroom 3A - 4A None
MARCO: Navigating the Unseen Space of Semantic Correspondence
Oral
Sat Jun 06 01:25 PM -- 01:37 PM (PDT) @ Mile High Ballroom 1A - 2A None
InfiniBench: Infinite Benchmarking for Visual Spatial Reasoning with Customizable Scene Complexity
Oral
Sat Jun 06 01:25 PM -- 01:37 PM (PDT) @ Bluebird Ballroom None
From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection
Oral
Sat Jun 06 01:25 PM -- 01:37 PM (PDT) @ Four Seasons Ballroom None
PAI-Bench: A Comprehensive Benchmark For Physical AI
Oral
Sat Jun 06 01:25 PM -- 01:37 PM (PDT) @ Mile High Ballroom 3A - 4A None
PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation
Oral
Sat Jun 06 01:37 PM -- 01:50 PM (PDT) @ Mile High Ballroom 3A - 4A None
R^2-Seg: Training-Free OOD Medical Tumor Segmentation via Anatomical Reasoning and Statistical Rejection
Oral
Sat Jun 06 01:37 PM -- 01:50 PM (PDT) @ Mile High Ballroom 1A - 2A None
MAGICIAN: Efficient Long-Term Planning with Imagined Gaussians for Active Mapping
Oral
Sat Jun 06 01:37 PM -- 01:50 PM (PDT) @ Four Seasons Ballroom None
RefAV: Towards Planning-Centric Scenario Mining
Oral
Sat Jun 06 01:37 PM -- 01:50 PM (PDT) @ Bluebird Ballroom None
Linear Fundamental Matrix Estimation from 7 or 5 Points
Oral
Sat Jun 06 01:50 PM -- 02:02 PM (PDT) @ Bluebird Ballroom None
OccuFly: A 3D Vision Benchmark for Semantic Scene Completion from the Aerial Perspective
Oral
Sat Jun 06 01:50 PM -- 02:02 PM (PDT) @ Mile High Ballroom 1A - 2A None
Memory-Augmented Scene Understanding and Exploration for Open-World Aerial Object-Goal Navigation
Oral
Sat Jun 06 01:50 PM -- 02:02 PM (PDT) @ Four Seasons Ballroom None
SoccerMaster: A Vision Foundation Model for Soccer Understanding
Oral
Sat Jun 06 01:50 PM -- 02:02 PM (PDT) @ Mile High Ballroom 3A - 4A None
The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification
Oral
Sat Jun 06 02:02 PM -- 02:15 PM (PDT) @ Mile High Ballroom 3A - 4A None
VGGT-Segmentor: Geometry-Enhanced Cross-View Segmentation
Oral
Sat Jun 06 02:02 PM -- 02:15 PM (PDT) @ Mile High Ballroom 1A - 2A None
Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
Oral
Sat Jun 06 02:02 PM -- 02:15 PM (PDT) @ Four Seasons Ballroom None
VS-Bench: Evaluating VLMs for Strategic Abilities in Multi-Agent Environments
Oral
Sat Jun 06 02:02 PM -- 02:15 PM (PDT) @ Bluebird Ballroom None
VGGT-Ω
Break
Sat Jun 06 02:15 PM -- 02:30 PM (PDT) None
Courtesy Break
Poster Setup
Sat Jun 06 03:15 PM -- 03:45 PM (PDT) @ ExHall A None
Poster Setup
Demonstration
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall F None
Demos
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 1
Chorus: Multi-Teacher Pretraining for Holistic 3D Gaussian Scene Encoding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 2
Featurising Pixels from Dynamic 3D Scenes with Linear In-Context Learners
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 3
From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 4
Linear Fundamental Matrix Estimation from 7 or 5 Points
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 5
OccuFly: A 3D Vision Benchmark for Semantic Scene Completion from the Aerial Perspective
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 6
VGGT-Ω
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 7
CodeV: Code with Images for Faithful Visual Reasoning via Tool-Aware Policy Optimization
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 8
NitroGen: An Open Foundation Model for Generalist Gaming Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 9
PAI-Bench: A Comprehensive Benchmark For Physical AI
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 10
RefAV: Towards Planning-Centric Scenario Mining
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 11
SoccerMaster: A Vision Foundation Model for Soccer Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 12
VS-Bench: Evaluating VLMs for Strategic Abilities in Multi-Agent Environments
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 13
Breaking the Scalability Limit of Multi-Projector Calibration with Embedded Cameras
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 14
GaussianFluent: Gaussian Simulation for Dynamic Scenes with Mixed Materials
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 15
InfiniBench: Infinite Benchmarking for Visual Spatial Reasoning with Customizable Scene Complexity
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 16
MAGICIAN: Efficient Long-Term Planning with Imagined Gaussians for Active Mapping
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 17
Memory-Augmented Scene Understanding and Exploration for Open-World Aerial Object-Goal Navigation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 18
Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 19
INSID3: Training-Free In-Context Segmentation with DINOv3
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 20
MARCO: Navigating the Unseen Space of Semantic Correspondence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 21
PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 22
R^2-Seg: Training-Free OOD Medical Tumor Segmentation via Anatomical Reasoning and Statistical Rejection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 23
The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 24
VGGT-Segmentor: Geometry-Enhanced Cross-View Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 25
DAGE: Dual-Stream Architecture for Efficient and Fine-Grained Geometry Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 26
Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 27
Lite Any Stereo: Efficient Zero-Shot Stereo Matching
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 28
MuM: Multi-View Masked Image Modeling for 3D Vision
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 29
ZipMap: Linear-Time Stateful 3D Reconstruction via Test-Time Training
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 30
Scal3R: Scalable Test-Time Training for Large-Scale 3D Reconstruction
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 31
LaRP: Efficient Multi-View Inpainting with Latent Reprojection Priors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 32
TopoMA: Topology-Guided Multi-Agent Dense RGB 3D Reconstruction via Distributed Inference
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 33
Sparse–View Localization via Online Neural 3D Regression
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 34
Dynamic Visual SLAM using a General 3D Prior
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 35
Learning Scene Coordinate Reconstruction from Unposed Images via Pose Graph Optimization
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 36
FlashVGGT: Efficient and Scalable Visual Geometry Transformers with Compressed Descriptor Attention
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 37
No Calibration, No Depth, No Problem: Cross-Sensor View Synthesis with 3D Consistency
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 38
UFO: Unifying Feed-Forward and Optimization-based Methods for Large Driving Scene Modeling
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 39
Reliev3R: Relieving Feed-forward 3D Reconstruction from Multi-View Geometric Annotations
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 40
TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 41
Global Structure-from-Motion Meets Feedforward Reconstruction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 42
POCA: Pareto-Optimal Curriculum Alignment for Visual Text Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 43
DuoGen: Towards Autonomous Interleaved Multimodal Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 44
Vibe Spaces for Creatively Connecting and Expressing Visual Concepts
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 45
StoryTailor: A Zero-Shot Pipeline for Action-Rich Multi-Subject Visual Narratives
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 46
CREward: A Type-Specific Creativity Reward Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 47
LumiX: Structured and Coherent Text-to-Intrinsic Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 48
Synthetic Curriculum Reinforces Compositional Text-to-Image Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 49
OmniGen2: Towards Instruction-Aligned Multimodal Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 50
Selectively Extracting and Injecting Visual Attributes into Text-to-Image Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 51
LoFA: Learning to Predict Personalized Prior for Fast Adaptation of Visual Generative Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 52
UniVerse: Empower Unified Generation with Reasoning and Knowledge
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 53
UniVerse: A Unified Modulation Framework for Segmentation-Free, Disentangled Multi-Concept Personalization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 54
Residual Decoder Adapter: ID-Preserving Tokenizer Adaption for Autoregressive Text Rendering
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 55
TGT: Text-Grounded Trajectories for Locally Controlled Video Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 56
RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 57
FlowFixer: Towards Detail-Preserving Subject-Driven Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 58
TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 59
UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 60
FEAT: Fashion Editing and Try-On from Any Design
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 61
Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 62
PointAlign: Feature-Level Alignment Regularization for 3D Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 63
PowerCLIP: Powerset Alignment for Contrastive Pre-Training
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 64
MoBind: Motion Binding for Fine-Grained IMU–Video Pose Alignment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 66
Tackling Model Bias via Game-theoretic Multi-agent Collaboration Framework for Hateful Meme Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 67
CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 68
MM-ReCoder: Advancing Chart-to-Code Generation with Reinforcement Learning and Self-Correction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 69
Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 70
Hierarchical Process Reward Models are Symbolic Vision Learners
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 71
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 73
AcTTA: Rethinking Test-Time Adaptation via Dynamic Activation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 74
Reframing Long-Tailed Learning via Loss Landscape Geometry
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 75
Cleaning the Pool: Progressive Filtering of Unlabeled Pools in Deep Active Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 76
DC-Merge: Improving Model Merging with Directional Consistency
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 77
TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 78
Event-Illumination Collaborative Low-light Image Enhancement with a High-resolution Real-world Dataset
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 79
NEC-Diff: Noise-Robust Event-RAW Complementary Diffusion for Seeing Motion in Extreme Darkness
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 80
Towards Persistence: Learning Topological Constraints for Event-based Small Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 81
Geometric-Photometric Event-based 3D Gaussian Ray Tracing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 82
EventDrive: Event Cameras for Vision-Language Driving Intelligence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 83
EventGait: Towards Robust Gait Recognition with Event Streams
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 84
MergeVLA: Cross-Skill Model Merging Toward a Generalist Vision-Language-Action Agent
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 85
Resolving the Stability-Plasticity Dilemma in Reinforcement Learning via Complementary Continual Critics
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 86
SAGE: Scalable Agentic 3D Scene Generation for Embodied AI
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 87
Semantic Audio-Visual Navigation in Continuous Environments
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 88
Unifying Perception and Action: A Hybrid-Modality Pipeline with Implicit Visual Chain-of-Thought for Robotic Action Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 89
FLARE: A Failure-Aware Framework for Autonomous Correction and Recovery in Visual-Language Robotic Manipulation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 90
Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 91
General Process Reward Modeling for Robotic Reinforcement Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 93
Action-Sketcher: From Reasoning to Action via Visual Sketches for Robotic Manipulation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 94
Thinking in 360°: Humanoid Visual Search in the Wild
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 95
Learning from Semantic Dictionaries: Discriminative Codebook Contrastive Learning for Unified Visual Representation and Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 96
MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 97
Cycle-Consistent Tuning for Layered Image Decomposition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 98
RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 99
Beyond Objects: Contextual Synthetic Data Generation for Fine-Grained Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 100
NEAF: Natural Image Editing with Attention Fusion for Generalizable Test-time Optimization in Text-Guided Image Editing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 101
OntoAug: Rethinking Generative Data Augmentation via Ontology Guidance
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 102
Spherical Voronoi: Directional Appearance as a Differentiable Partition of the Sphere
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 103
4DSurf: High-Fidelity Dynamic Scene Surface Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 104
Learning 3D Representations for Spatial Intelligence from Unposed Multi-View Images
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 105
Depth Peeling for High-Fidelity Gaussian-Enhanced Surfel Rendering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 106
Intrinsic Image Fusion for Multi-View 3D Material Reconstruction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 107
PackUV: Packed Gaussian UV Maps for 4D Volumetric Video
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 108
Opti-NeuS: Neural Reconstruction for Dual-Layered Transparent and Opaque Objects
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 109
PhysGaia: A Physics-aware Benchmark with Multi-Body Interactions for Dynamic Novel View Synthesis
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 110
MatSpray: Fusing 2D Material World Knowledge on 3D Geometry
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 111
OMoBlur: An Object Motion Blur Dataset and Benchmark for Real-World Local Motion Deblurring
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 112
Hybrid Agents for Image Restoration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 113
Zero-Shot Image Denoising via Hybrid Prior-Guided Pseudo Sample Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 114
Self-supervised Dynamic Heterogeneous Degradation Modeling for Unified Zero-Shot Image Restoration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 115
Next-Scale Prediction: A Self-Supervised Approach for Real-World Image Denoising
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 116
PhaSR: Generalized Image Shadow Removal with Physically Aligned Priors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 117
UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 118
FastGaMer: Efficient GainMap Learning for Practical Inverse Tone Mapping
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 119
MDS-VQA: Model-Informed Data Selection for Video Quality Assessment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 120
Seeing through Light and Darkness: Sensor-Physics Grounded Deblurring HDR NeRF from Single-Exposure Images and Events
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 121
Disentanglement-wise Image Dehazing through Cross-Domain Manifold Consensus
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 122
Unsupervised Multi-Scale Segmentation of 3D Subcellular World with Stable Diffusion Foundation Model
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 123
EchoPOSE: 6D Pose Estimation of Sparse Echocardiograms for Left-Ventricular 3D Shape Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 124
Spatial-SAM: Spatially Consistent 3D Electron Microscopy Segmentation with SDF Memory and Semi-Supervised Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 125
LLaDA-MedV: Exploring Large Language Diffusion Models for Biomedical Image Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 126
TAlignDiff: Automatic Tooth Alignment assisted by Diffusion-based Transformation Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 127
Harmonized Feature Conditioning and Frequency-Prompt Personalization for Multi-Rater Medical Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 128
Masked-Diffusion Autoencoders for 3D Medical Vision Representation Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 129
PGR-Net: Prior-Guided ROI Reasoning Network for Brain Tumor MRI Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 130
Test-Time Attention Purification for Backdoored Large Vision Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 131
AGFT: Alignment-Guided Fine-Tuning for Zero-Shot Adversarial Robustness of Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 132
Towards Robust Multimodal Large Language Models Against Jailbreak Attacks
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 133
R^2TUA: Reconstruction-residual Based Targeted and Untargeted Attack Against Text-Image Person Re-Identification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 134
When Robots Obey the Patch: Universal Transferable Patch Attacks on Vision-Language-Action Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 135
FlowHijack: A Dynamics-Aware Backdoor Attack on Flow-Matching Vision-Language-Action Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 136
Principled Steering via Null-space Projection for Jailbreak Defense in Vision-Language Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 137
Enhancing Part-Level Point Grounding for Any Open-Source MLLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 138
MeteorPred: A Meteorological Multimodal Large Model and Dataset for Severe Weather Event Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 139
YieldSAT: A Multimodal Benchmark Dataset for High-Resolution Crop Yield Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 140
How Far Can We Go With Synthetic Data for Audio-Visual Sound Source Localization?
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 141
Modeling Cross-vision Synergy for Unified Large Vision Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 142
Beyond Missing Modalities: Hypergraph Conditioned Diffusion for Uncertainty-Aware Multimodal Emotion Recognition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 143
Rosetta Stone For Unified MLLMs: A Unified Tokenizer to Decipher Understanding and Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 144
MOON2.0: Dynamic Modality-balanced Multimodal Representation Learning for E-commerce Product Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 145
Nano-EmoX: Unifying Multimodal Emotional Intelligence from Perception to Empathy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 146
AMusE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 147
Prototype-as-Prompt: Multimodal Sentiment Prototypes Endowing Large Language Models the Capability to Perform Multimodal Sentiment Analysis
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 148
CF-IPT: Cross-Modal Fusion Interactive Prompt Tuning of Vision-Language Pre-Trained Model for Multisource Remote Sensing Data Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 149
EMAD: Evidence-Centric Grounded Multimodal Diagnosis for Alzheimer’s Disease
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 150
Multimodal Learning on Low-Quality Data with Conformal Predictive Self-Calibration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 151
Cross-View Distillation and Adaptive Masking for Incomplete Multi-View Multi-Label Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 152
Bootstrap Your Own AV-Proxies: Adaptive Contrastive and Prototype Learning for Audio-Visual Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 154
M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 155
Text-Driven 3D Hand Motion Generation from Sign Language Data
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 156
Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 157
GenHOI: Towards Object-Consistent Hand–Object Interaction with Temporally Balanced and Spatially Selective Object Injection
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 158
Clay-to-Stone: Phase-wise 3D Gaussian Splatting for Monocular Articulated Hand-Object Manipulation Modeling
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 159
Training-free Motion Factorization for Compositional Video Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 160
Audio-sync Video Instance Editing with Granularity-Aware Mask Refiner
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 161
CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 162
FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 163
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 164
PoseAnything: General Pose-guided Video Generation with Part-aware Temporal Coherence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 165
FastHybrid: Accelerating Hybrid Autoregressive Image Generation with Lookahead and Guided Decoding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 166
DPAR: Dynamic Patchification for Efficient Autoregressive Visual Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 167
AlcheMinT: Fine-grained Temporal Control for Multi-Reference Consistent Video Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 168
LeapAlign: Post-training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 169
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 170
Flow Matching for Multimodal Distributions
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 171
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 172
ReasonEdit: Towards Reasoning-Enhanced Image Editing Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 173
Cross-Subject EEG-to-Video Reconstruction and Beyond
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 174
Rethinking Position Embedding as a Context Controller for Multi-Reference and Multi-Shot Video Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 175
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 176
BiFM: Bidirectional Flow Matching for Few-Step Image Editing and Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 177
DTG-Restore: Training-Free Diffusion Refinement for Generative Video Super-Resolution
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 178
VABench: A Comprehensive Benchmark for Audio-Video Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 179
Relightful Video Portrait Harmonization
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 180
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 181
DVAR: Dynamic Visual Autoregressive Modeling for Image Super-Resolution
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 182
Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 184
UCAN: Unified Convolutional Attention Network for Expansive Receptive Fields in Lightweight Super-Resolution
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 186
RAW-Domain Degradation Models for Realistic Smartphone Super-Resolution
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 187
One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 188
FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 189
HDW-SR: High-Frequency Guided Diffusion Model based on Wavelet Decomposition for Image Super-Resolution
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 190
Unifying Precise Keyframes and Semantic Control via Multi-level Diffusion
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 191
CIGPose: Causal Intervention Graph Neural Network for Whole-Body Pose Estimation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 192
Pressure2Motion: Hierarchical Human Motion Reconstruction from Ground Pressure with Text Guidance
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 193
From 3D Pose to Prose: Biomechanics-Grounded Vision–Language Coaching
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 194
InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 195
MoCoDiff: A Controllable Autoregressive Diffusion Model for Expressive Motion Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 196
W2W: Language-Model-Based Trajectory Prediction with Reinforcement Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 197
ParTY: Part-Guidance for Expressive Text-to-Motion Synthesis
[Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 198
Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 199
Unified Number-Free Text-to-Motion Generation Via Flow Matching
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 200
Generative Diffusion Priors for 3D Mapping of the Dark Universe
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 201
FlowPalm: Optical Flow Driven Non-Rigid Deformation for Geometrically Diverse Palmprint Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 202
DiffuView: Multi-View Diffusion Pretraining for 3D Aware Robotic Manipulation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 203
Circuit Mechanisms for Spatial Relation Generation in Diffusion Transformers
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 204
Dual Ascent Diffusion for Inverse Problems
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 205
Forecast the Principal, Stabilize the Residual: Subspace-Aware Feature Caching for Diffusion Transformers
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 206
Spatial-Spectral Residuals Informed Diffusion Neural Operator for Pan-sharpening
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 207
PhyOceanCast: Global Ocean Forecasting with Physics-Informed Diffusion
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 208
Pixel Motion Diffusion is What We Need for Robot Control
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 210
M3Grounder: Mask-Based Multi-Span and Multi-Granular Grounding for Document QA
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 211
BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 212
Towards Real-World Document Parsing via Realistic Scene Synthesis and Document-Aware Training
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 213
RoadSceneBench: A Lightweight Benchmark for Mid-Level Road Scene Understanding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 214
UNICBench: UNIfied Counting Benchmark for MLLM
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 215
CaptionQA: Is Your Caption as Useful as the Image Itself?
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 216
EgoProx: Evaluating MLLMs on Egocentric 3D Proximity Reasoning Across a Cognitive Hierarchy
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 217
VULCAN: Tool-Augmented Multi Agents for Iterative 3D Object Arrangement
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 219
Efficient Encoder-Free Fourier-based 3D Large Multimodal Model
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 220
Socratic-Geo: Synthetic Data Generation and Cross-Modal Geometric Reasoning via Multi-Agent Interaction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 221
HAMMER: Harnessing MLLMs via Cross-Modal Integration for Intention-Driven 3D Affordance Grounding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 222
Proxy3D: Efficient 3D Representations for Vision-Language Models via Semantic Clustering and Alignment
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 223
ReLaGS: Relational Language Gaussian Splatting
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 224
3D-IDE: 3D Implicit Depth Emergent
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 226
Parse, Search, and Confirmation: Training-Free Aerial Vision-and-Dialog Navigation with Chain-of-Thought Reasoning and Structured Spatial Memory
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 227
4DP-QA: Scalable QA for 4D Perception in Vision Language Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 228
LASAR: Towards Spatio-temporal Reasoning with Latent Cognitive Map
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 229
Text-Phase Synergy Network with Dual Priors for Unsupervised Cross-Domain Image Retrieval
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 230
EagleNet: Energy-Aware Fine-Grained Relationship Learning Network for Text-Video Retrieval
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 231
PIX-TAB: Efficient PIXel-Precise TABle Structure Recognition Approach with Speculative Decoding and Region-Based Image Segmentation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 232
CARLoS: Retrieval via Concise Assessment Representation of LoRAs at Scale
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 233
Camouflage-aware Image-Text Retrieval via Expert Collaboration
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 234
TriSim: Tri-Dimensional Similarity Modeling with Extreme Value Theory for False-Negative Mitigation in Remote Sensing Image-Text Retrieval
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 235
TIGER: A Unified Framework for Time, Images and Geo-location Retrieval
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 236
Mistake Attribution: Fine-Grained Mistake Understanding in Egocentric Videos
[Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 237
VidTAG: Temporally Aligned Video to GPS Geolocalization with Denoising Sequence Prediction at a Global Scale
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 238
Stitch-a-Demo: Creating Video Demonstrations from Multistep Descriptions
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 239
Prototypical Action Reasoning Facilitated by Vision-Language Alignment for Egocentric Action Anticipation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 240
AdaSpot: Spend Resolution Where It Matters for Precise Event Spotting
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 241
Unique Lives, Shared World: Learning from Single-Life Videos
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 242
Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 243
VideoARM: Agentic Reasoning over Hierarchical Memory for Long-Form Video Understanding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 244
Wavelet-based Frame Selection by Detecting Semantic Boundary for Long Video Understanding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 245
SVAgent: Storyline-guided Long Video Understanding via Cross-Modal Multi-Agent Collaboration
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 246
Frame2Freq: Spectral Adapters for Fine-Grained Video Understanding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 247
Structural Graph Probing of Vision–Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 248
Saliency-R1: Enforcing Interpretable and Faithful Vision-language Reasoning via Saliency-map Alignment Reward
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 250
MaskDiME: Adaptive Masked Diffusion for Precise and Efficient Visual Counterfactual Explanations
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 251
TRANSPORTER: Transferring Visual Semantics from VLM Manifolds
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 252
Relational Visual Similarity
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 253
PointCNN++: Performant Convolution on Native Points
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 254
Fast Markov Random Field Optimisation for Topologically Noisy 3D Shape Matching
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 255
LitePT: Lighter Yet Stronger Point Transformer
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 256
SuP: Sub-cloud Driven Point Cloud Registration
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 257
PQDT: Pseudo-Query Dual Transformer for Robust Point Cloud Restoration
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 258
Test-Time Training for LiDAR Semantic Segmentation under Corruption via Geometric Inlier Discrimination
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 259
MHopReg: Efficient Hierarchical Multi-Hop Graph Search for Point Cloud Registration
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 260
GEM: Generating LiDAR World Model via Deformable Mamba
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 261
Hybrid Robust Collaborative Perception with LiDAR-4D Radar Fusion under Adverse Weather Conditions
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 262
Task-Driven Implicit Representations for Automated Design of LiDAR Systems
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 263
Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 264
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 265
Beyond Layer-Wise Merging: Chain-of-Merging for Vision-Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 266
GazeShift: Unsupervised Gaze Estimation and Dataset for VR
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 267
Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 268
Reevaluating the Intra-Modal Misalignment Hypothesis in CLIP
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 269
Dr. Seg: Revisiting GRPO Training for Visual Large Language Models through Perception-Oriented Design
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 270
Soft Modality-Guided Expert Specialization in MoE-VLMs
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 271
CoVFT: Context-aware Visual Fine-tuning for Multimodal Large Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 273
AutoRegressive Generation with B-rep Holistic Token Sequence Representation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 274
VecGlypher: Unified Vector Glyph Generation with Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 275
NERFIFY: A Multi-Agent Framework for Turning NeRF Papers into Code
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 276
Diagram2Structure: Unlocking LLMs' Diagram Comprehension through DiagramDiff, an Offline Diagram Structuring Framework
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 277
ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 278
GardenDesigner: Encoding Aesthetic Principles into Jiangnan Garden Construction via a Chain of Agents
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 279
ShadowDraw: From Any Object to Shadow-Drawing Compositional Art
[Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 280
End-to-End Hyper-Relational Information Extraction for Engineering Diagrams via Dynamically Tokenized Relation Transformer
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 281
When Anonymity Breaks: Identifying Models Behind Text-to-Image Leaderboards
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 282
Bias at the End of the Score
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 283
PECCVAI: Overcoming the Brittleness of AI Image Watermarking Under Visual Paraphrasing Attacks
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 284
Dynamic Token Reweighting for Robust Vision-Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 285
COPYLENS: Towards Copyrighted Characters Infringement Detection via Copyright-Aware Prompt Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 286
Closed-Form Concept Erasure via Double Projections
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 287
Adaptive Bayesian Early-Exit Networks for Efficient Non-Transferable Learning
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 288
Stake the Points: Structure-Faithful Instance Unlearning
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 289
Federated Active Learning Under Extreme Non-IID and Global Class Imbalance
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 290
FedRG: Unleashing the Representation Geometry for Federated Learning with Noisy Clients
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 291
FedCART: Tackling Long-Tailed Distributions in Federated Adversarial Training via Classifier Refinement
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 292
Generalized and Personalized Federated Learning with Black-Box Foundation Models via Orthogonal Transformations
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 293
Fully Decentralized Certified Unlearning
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 294
Fed-ADE: Adaptive Learning Rate for Federated Post-adaptation under Distribution Shift
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 295
Towards Streaming Referring Video Segmentation via Large Language Model
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 296
Multi-speaker Attention Alignment for Multimodal Social Interaction
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 297
OmniVTG: A Large-Scale Dataset and Training Paradigm for Open-World Video Temporal Grounding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 298
SARL-STG: A Spatially Aware Reinforcement Learning Framework for Refining MLLMs in Spatio-Temporal Video Grounding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 299
VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 301
UniCompress: Token Compression for Unified Vision–Language Understanding and Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 302
StreamingTOM: Streaming Token Compression for Efficient Video Understanding
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 303
SCoRe: Salience-Coverage Reduction for Vision Token Pruning in Vision-Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 304
VLM-PTQ: Efficient Post-Training Quantization for Large Vision-Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 305
Aligning What Vision-Language Models See and Perceive with Adaptive Information Flow
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 307
Rethinking Token Reduction for Large Vision-Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 308
Prototype-based Causal Intervention for Multi-Label Image Classification
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 309
FAST: Topology-Aware Frequency-Domain Distribution Matching for Coreset Selection
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 310
Face-Guided Sentiment Boundary Enhancement for Weakly-Supervised Temporal Sentiment Localization
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 311
Evidential Deep Partial Label Learning to Quantify Disambiguation Uncertainty
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 312
Unlocking Strong Supervision: A Data-Centric Study of General-Purpose Audio Pre-Training Methods
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 313
Revisiting Learning with Noisy Labels: Active Forgetting and Noise Suppression
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 314
PAF: Perturbation-Aware Filtering for Open-Set Semi-Supervised Learning
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 315
Global-Graph Guided and Local-Graph Weighted Contrastive Learning for Unified Clustering on Incomplete and Noise Multi-View Data
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 316
Enhancing Out-of-Distribution Detection with Extended Logit Normalization
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 318
Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 319
SafeDrive: Fine-Grained Safety Reasoning for End-to-End Driving in a Sparse World
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 320
RAG-TP: A General Framework for Vehicle Trajectory Prediction via Retrieval-Augmented Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 321
Perceiving the Near, Reasoning the Distant: Coherent Long-Horizon Trajectory Prediction for Autonomous Driving
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 322
Dual-Agent Reinforcement Learning for Adaptive and Cost-Aware Visual–Inertial Odometry
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 323
HorizonForge: Driving Scene Editing with Any Trajectories and Any Vehicles
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 324
AMap: Distilling Future Priors for Ahead-Aware Online HD Map Construction
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 325
WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 326
PlannerRFT: Reinforcing Diffusion Planners through Closed-Loop and Sample-Efficient Fine-Tuning
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 327
MARIS: Marine Open-Vocabulary Instance Segmentation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 328
XSeg: A Large-scale X-ray Contraband Segmentation Benchmark For Real-World Security Screening
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 329
Training-Free Open-Vocabulary Camouflaged Object Segmentation via Fine-Grained Object Binding and Adaptive Hybrid Prompt
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 330
M⁴-SAM: Multi-Modal Mixture-of-Experts with Memory-Augmented SAM for RGB-D Video Salient Object Detection
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 331
ReAttnCLIP: Training-Free Open-Vocabulary Remote Sensing Image Segmentation via Re-defined Attention in CLIP
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 332
Mixture of Prototypes for Test-time Adaptive Segmentation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 333
Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 334
ELVIS: Enhance Low-Light for Video Instance Segmentation in the Dark
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 335
Decouple Your Discovery and Memory in Continual Generalized Category Discovery
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 336
Beyond the Static World: Continual Category Discovery under Visual Drift
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 337
Memory-Efficient Transfer Learning with Fading Side Networks via Masked Dual Path Distillation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 338
SAME: Sparse and Anchored Model Editing for Heterogeneous Incremental Learning under Limited Data
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 339
CHEEM: Continual Learning by Reuse, New, Adapt and Skip - A Hierarchical Exploration-Exploitation Approach
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 340
Exemplar-Free Continual Learning for State Space Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 341
A Faster Path to Continual Learning
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 342
Continual Learning for fMRI-Based Brain Disorder Diagnosis via Functional Connectivity Matrices Generative Replay
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 343
BeautyGRPO: Aesthetic Alignment for Face Retouching via Dynamic Path Guidance and Fine-Grained Preference Modeling
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 344
SyncDreamer: Controllable and Expressive Avatar Generation Beyond the Talking Head
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 345
PerformRecast: Expression and Head Pose Disentanglement for Portrait Video Editing
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 346
UniLS: End-to-End Audio-Driven Avatars for Unified Listening and Speaking
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 347
PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 348
FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 349
DriveVLN: Towards Mapless Vision-and-Language Navigation in Autonomous Driving
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 350
Towards Open Environments and Instructions: General Vision-Language Navigation via Fast-Slow Interactive Reasoning
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 351
Unifying Language-Action Understanding and Generation for Autonomous Driving
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 352
Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 353
Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 354
CGHair: Compact Gaussian Hair Reconstruction with Card Clustering
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 355
HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 356
Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 357
RelightAnyone: A Generalized Relightable 3D Gaussian Head Model
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 358
Feed-forward Gaussian Registration for Head Avatar Creation and Editing
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 359
Residual Decoding: Mitigating Hallucinations in Large Vision-Language Models via History-Aware Residual Guidance
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 360
Prefill-Time Intervention for Mitigating Hallucination in Large Vision-Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 361
SVHalluc: Benchmarking Speech–Vision Hallucination in Audio-Visual Large Language Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 362
Same Attention, Different Truths: Put Logit-Lens over Visual Attention to Detect and Mitigate LVLM Object Hallucination
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 363
Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 364
Lyapunov Probes for Hallucination Detection in Large Foundation Models
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 365
Captain Safari: A World Engine with Pose-Aligned 3D Memory
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 366
Gen3R: 3D Scene Generation Meets Feed-Forward Reconstruction
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 367
PerpetualWonder: Long-horizon Action-conditioned 4D Scene Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 368
CineScene: Implicit 3D as Effective Scene Representation for Cinematic Video Generation
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 369
DreamStereo: Towards Real-Time Stereo Inpainting for HD Videos
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 371
RecEdit-Drive: 3D Reconstruction-Guided Spatiotemporal Video Editing for Autonomous Driving Scenes
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 372
RAYNOVA: Scale-Temporal Autoregressive World Modeling in Ray Space
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 373
RigMo: Unifying Rig and Motion Learning for Generative Animation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 374
LaVR: Scene Latent Conditioned Generative Video Trajectory Re-Rendering using Large 4D Reconstruction Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 376
Detect Anything via Next Point Prediction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 378
Distribution-Aligned Multimodal Fusion for Robust Object Detection
[Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 379
PaQ-DETR: Learning Pattern and Quality-Aware Dynamic Queries for Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 381
Efficiency Follows Global-Local Decoupling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 382
VRCLIP: Multimodal Canonical Correlation Alignment for CLIP-Driven Vision-Radio Person Re-Identification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 383
EReCu: Pseudo-label Evolution Fusion and Refinement with Multi-Cue Learning for Unsupervised Camouflage Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 384
Expert-Teacher-Student Collaborative Learning for Domain Adaptive Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 385
CI-VID: A Coherent Interleaved Text-Video Dataset
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 386
Generalizable Video Quality Assessment via Weak-to-Strong Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 387
EgoSound: Benchmarking Sound Understanding in Egocentric Videos
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 389
GIFT: Global Irreplaceability Frame Targeting for Efficient Video Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 390
Select Less, Reason More: Prioritizing Evidence Purity for Video Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 391
Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 392
Compositional Transformation Reasoning for Composed Video Retrieval
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 393
UniVBench: Towards Unified Evaluation for Video Foundation Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 394
NAMI: Efficient Image Generation via Bridged Progressive Rectified Flow Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 395
InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 396
TimeRipples: Accelerating vDiTs by Understanding the Spatio-Temporal Correlations in Latent Space
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 397
ProcessMaker: A Generalized Process Visualization Framework with Adaptive Sequence Steps on Diffusion Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 398
MeanFlow Transformers with Representation Autoencoders
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 399
DiT-IC: Aligned Diffusion Transformer for Efficient Image Compression
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 400
FARMER: Flow AutoRegressive Transformer over Pixels
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 401
Probabilistic Precipitation Nowcasting with Rectified Flow Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 402
FlowDC: Flow-Based Decoupling-Decay for Complex Image Editing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 403
High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 404
3D-Object Perception Transformer (3PT)
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 405
SemLT3D: Semantic-Guided Expert Distillation for Camera-only Long-Tailed 3D Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 406
Spe-BEVHead: Rethinking the Detection Head Design for Bird’s-Eye-View Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 407
Unsupervised Multi-agent and Single-agent Perception from Cooperative Views
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 408
Zoo3D: Zero-Shot 3D Object Detection at Scene Level
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 409
Beyond Appearance: Camouflaged Object Detection via Geometric Structure
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 410
SABER: Spatially Consistent 3D Universal Adversarial Objects for BEV Detectors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 411
AceTone: Bridging Words and Colors for Conditional Image Grading
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 412
Do VLMs Perceive or Recall? Probing Visual Perception vs. Memory with Classic Visual Illusions
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 413
Pixels Don't Lie (But Your Detector Might): Bootstrapping MLLM-as-a-Judge for Trustworthy Deepfake Detection and Reasoning Supervision
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 414
UI-Lens: Assessing General MLLMs’ Potential to Automate UI Display Quality Assurance
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 415
Seeing is Improving: Visual Feedback for Iterative Text Layout Refinement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 416
Is your VLM Sky-Ready? A Comprehensive Spatial Intelligence Benchmark for UAV Navigation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 417
Linking Perception, Confidence and Accuracy in MLLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 418
AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 419
Learning to Focus and Precise Cropping: A Reinforcement Learning Framework with Information Gaps and Grounding Loss for MLLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 420
From Pixel to Precision: Enhancing Handwritten Mathematical Expression Recognition with Image-Level Reward
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 421
Rethinking Pose Refinement in 3D Gaussian Splatting under Pose Prior and Geometric Uncertainty
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 422
Revisiting Pose Sensitivity in Splat-based Computed Tomography under Sparse-view Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 423
Seele: A Unified Acceleration Framework for Real-Time Gaussian Splatting on Mobile Devices
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 424
GHPT: Real-Time Relightable Gaussian Splatting using Hybrid Path Tracing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 425
PolarGuide-GSDR: 3D Gaussian Splatting Driven by Polarization Priors and Deferred Reflection for Real-World Reflective Scenes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 426
EcoSplat: Efficiency-controllable Feed-forward 3D Gaussian Splatting from Multi-view Images
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 427
SGS-Intrinsic: Semantic-Invariant Gaussian Splatting for Sparse-View Indoor Inverse Rendering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 428
GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 429
3D Gaussian Splatting with Self-Constrained Priors for High Fidelity Surface Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 430
FilterGS: Traversal-Free Parallel Filtering and Adaptive Shrinking for Large-Scale LoD 3D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 431
TWINGS: Thin Plate Splines Warp-aligned Initialization for Sparse-View Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 432
VarSplat: Uncertainty-aware 3D Gaussian Splatting for Robust RGB-D SLAM
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 434
FastGS: Training 3D Gaussian Splatting in 100 Seconds
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 435
BrepGaussian: CAD reconstruction from Multi-View Images with Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 436
ODGS-SLAM: Omnidirectional Gaussian Splatting SLAM
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 437
BA-GS: Bayesian Adaptive Gaussian Splatting for SFM-Free 3D Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 438
FSFSplatter: Geometrically Accurate Reconstruction with Free Sparse-view Images within 2 minutes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 439
ViRC: Enhancing Visual Interleaved Mathematical CoT with Reason Chunking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 440
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 441
PixDLM: A Dual-Path Multimodal Language Model for UAV Reasoning Segmentation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 442
Can a Second-View Image Be a Language? Geometric and Semantic Cross-Modal Reasoning for X-ray Prohibited Item Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 443
VCU-Bridge: Hierarchical Visual Connotation Understanding via Semantic Bridging
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 444
Learning to See through Illumination Extremes with Event Streaming in Multimodal Large Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 445
VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 446
Cut to the Chase: Training-free Multimodal Summarization via Chain-of-Events
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 447
UVU: Improving Multimodal Understanding via Vision-Language Unified Autoregressive Paradigm
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 448
PointThinker: Point-Incentivized Parallel Thinking for Multimodal Large Language Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 449
OctoMed: Data Recipes for State-of-the-Art Multimodal Medical Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 450
HoneyBee: Data Recipes for Vision-Language Reasoners
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 451
VisPlay: Self-Evolving Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 452
Chart-FR1: Visual Focus-Driven Fine-Grained Reasoning on Dense Charts
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 453
Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 454
ApET: Approximation-Error Guided Token Compression for Efficient VLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 455
Granulon: Awakening Pixel-Level Visual Encoders with Adaptive Multi-Granularity Semantics for MLLM
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 456
Vision Transformers Need More Than Registers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 457
Head-wise Adaptive Rotary Positional Encoding for Fine-Grained Image Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 458
PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 459
AdaSVD: Singular Value Decomposition with Adaptive Mechanisms for Large Multimodal Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 460
ReFTA: Breaking the Weight Reconstruction Bottleneck in Tensorized Parameter-Efficient Fine-Tuning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 461
HTTM: Head-wise Temporal Token Merging for Faster VGGT
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 463
Self-Attention Driven Tensor Representation for High-Order Data Recovery
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 464
PlanaReLoc: Camera Relocalization in 3D Planar Primitives via Region-Based Structure Matching
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 465
MOGeo: Beyond One-to-One Cross-View Object Geo-localization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 466
Homaloidal parametrization for detecting critical two-view configurations
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 467
AsymLoc: Towards Asymmetric Feature Matching for Efficient Visual Localization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 469
Asking like Socrates: Socrates helps VLMs understand remote sensing images
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 470
GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 471
Let VLMs Grade Their Own Thoughts: A Self-Quantification Approach to Reasoning-Aware Reward Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 472
SciEducator: Scientific Video Understanding and Educating via Deming-Cycle Multi-Agent System
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 473
SenseSearch: Empowering Vision-Language Models with High-Resolution Agentic Search-Reasoning via Reinforcement Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 474
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 475
VideoSSR: Video Self-Supervised Reinforcement Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 476
Neurodynamics-Driven Coupled Neural P Systems for Multi-Focus Image Fusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 477
MagicFuse: Single Image Fusion for Visual and Semantic Reinforcement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 478
Bridging Pixels and Words: Mask-Aware Local Semantic Fusion for Multimodal Media Verification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 479
Human-Centric Multi-Exposure Fusion: Benchmark and Bi-level Cognition Distillation Framework
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 480
ConceptPose: Training-Free Zero-Shot Object Pose Estimation using Concept Vectors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 481
A Closer Look at Cross-Domain Few-Shot Object Detection: Fine-Tuning Matters and Parallel Decoder Helps
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 482
NAF: Zero-Shot Feature Upsampling via Neighborhood Attention Filtering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 483
Universal-to-Specific: Dynamic Knowledge-Guided Multiple Instance Learning for Few-Shot Whole Slide Image Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 484
SOTA: Self-adaptive Optimal Transport for Zero-Shot Classification with Multiple Foundation Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 485
Uni-DAD: Unified Distillation and Adaptation of Diffusion Models for Few-step Few-shot Image Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 488
IMS3: Breaking Distributional Aggregation in Diffusion-Based Dataset Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 489
Continuous Exposure-Time Modeling for Realistic Atmospheric Turbulence Synthesis
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 490
240FPS Stereo Vision from Monocular Mixed Spikes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 491
D^2-FOSA: Dual-Diffusion Guided EEG-to-Image Reconstruction with Frequency-Oriented Semantic Alignment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 492
Self-Diffusion Driven Blind Imaging
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 493
Differentiable Stroke Planning with Dual Parameterization for Efficient and High-Fidelity Painting Creation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 494
Solvability of the Viewing Graph Under the Affine Camera Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 495
DiffBMP: Differentiable Rendering with Bitmap Primitives
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 496
Splat-Based Metal Artifact Reduction in Cone-Beam CT via Compact Attenuation Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 497
Lumosaic: Hyperspectral Video via Active Illumination and Coded-Exposure Pixels
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 498
Towards Universal Computational Aberration Correction in Photographic Cameras: A Comprehensive Benchmark Analysis
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 499
Multi-View Hierarchical Alignment Learning for Spatial Transcriptomics
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 500
FEAST: Fully Connected Expressive Attention for Spatial Transcriptomics
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 502
OrienPose: Orientation-Guided Novel View Synthesis for Single-Image Unseen Object Pose Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 503
Illustrator’s Depth: Monocular Layer Index Prediction for Image Decomposition
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 504
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 505
Seeing Depth Through Frequency and Motion: A Progressive Training Paradigm for Monocular Depth Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 506
GeoGuide: Hierarchical Geometric Guidance for Open-Vocabulary 3D Semantic Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 508
PE3R: Perception-Efficient 3D Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 509
GS-ASM: 2DGS-Supervised Active Stereo Matching
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 510
Real2Sim2Real: RetinalDepth-64K for Depth Estimation in Posterior Segment Ophthalmic Surgery
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 511
Iris: Bringing Real-World Priors into Diffusion Model for Monocular Depth Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 512
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 513
AirSim360: A Panoramic Simulation Platform within Drone View
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 514
Radar-Guided Polynomial Fitting for Metric Depth Estimation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 515
UniDAC: Universal Metric Depth Estimation for Any Camera
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 517
I-Scene: 3D Instance Models are Implicit Generalizable Spatial Learners
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 518
REVIVE 3D: Refinement via Encoded Voluminous Inflated prior for Volume Enhancement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 519
Muses: Designing, Composing, Generating Nonexistent Fantasy 3D Creatures without Training
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 520
EI-Part: Explode for Completion and Implode for Refinement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 521
MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 522
Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 523
ViLearn: Accelerating Training Convergence of Image-to-3D Generation via Visibility Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 524
FlashMesh: Faster and Better Autoregressive Mesh Synthesis via Structured Speculation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 525
X-Part: High Fidelity And Structure Coherent Shape Decomposition And Completion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 526
Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 527
TopoMesh: High-Fidelity Mesh Autoencoding via Topological Unification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 528
Nestwork: Conditional 3D Furnished House Layout Generation through Latent Heterogeneous Graph Diffusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 529
TEXTRIX: Latent Attribute Grid for Native Texture Generation and Beyond
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 530
Beyond Geometry: Artistic Disparity Synthesis for Immersive 2D-to-3D
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 531
WorldGen: From Text to Traversable and Interactive 3D Worlds
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 532
ExMesh: EXplicit Mesh Reconstruction with Topology Adaptation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 533
SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 534
ShapeR: Robust Conditional 3D Shape Generation from Casual Captures
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 536
3DrawAgent: Teaching LLM to Draw in 3D with Early Contrastive Experience
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 537
Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 538
HiFi-BRep: High-Fidelity Latent Representation for Robust B-Rep Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 539
PhysGen: Physically Grounded 3D Shape Generation for Industrial Design
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 540
Perceptual 3D Simulation With Physical World Modeling
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 541
EchoFoley: Event-Centric Hierarchical Control for Video Grounded Creative Sound Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 542
Active Intelligence in Video Avatars via Closed-loop World Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 543
Enhancing Spatial Understanding in Image Generation via Reward Modeling
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 544
Seeing What Matters: Visual Preference Policy Optimization for Visual Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 545
TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 546
Identity-Preserving Image-to-Video Generation via Reward-Guided Optimization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 547
JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 548
Learning Latent Proxies for Controllable Single-Image Relighting
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 549
MoVie: Broaden Your Views with Human Motion for Action Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 550
MooCap: A Multi-View Benchmark for Cow-Object-Human Interaction and Behavior Dynamics
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 551
LAOF: Robust Latent Action Learning with Optical Flow Constraints
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 552
DarkAct: A RGB-Thermal Dataset and Fusion Framework for Multimodal Low-Light Action Recognition
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 554
Steering Where to Diffuse: Generative Modeling of Phenotypic Response Simulation with Steered Diffusion Bridge
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 557
RNN as Linear Transformer: A Closer Investigation into Representational Potentials of Visual Mamba Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 558
Coupling Liquid Time‑Constant Encoders with Modern Hopfield Memory
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 559
Stronger Normalization-Free Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 560
HCL-FF: Hierarchical and Contrastive Learning for Forward-Forward Algorithm
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 561
Can You Learn to See Without Images? Procedural Warm-Up for Vision Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 562
Convolutional Neural Networks Driven by Content Similarity
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 563
MorphSeek: Fine-grained Latent Representation-Level Policy Optimization for Deformable Image Registration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 564
HATS: Hardness-Aware Trajectory Synthesis for GUI Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 565
MVP: Multiple View Prediction Improves GUI Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 566
Towards GUI Agents: Vision-Language Diffusion Models for GUI Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 567
ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence On Mobile Devices
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 568
OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 569
Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 571
Beyond Weak Supervision: MLLMs-Guided Graded Knowledge Distillation for Unsupervised Camouflaged Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 572
Detecting Unknown Objects via Energy-based Separation for Open World Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 573
Beyond Prompt Degradation: Prototype-guided Dual-pool Prompting for Incremental Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 574
SPAR: Single-Pass Any-Resolution ViT for Open-vocabulary Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 575
TTL: Test-time Textual Learning for OOD Detection with Pretrained Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 576
Parameterized Prompt for Incremental Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 577
SRA-Det: Learning Omni-Grained Open-Vocabulary Detection Beyond Category Names
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 578
Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation?
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 579
PCA-Seg: Revisiting Cost Aggregation for Open-Vocabulary Semantic and Part Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 580
Partial Weakly-Supervised Oriented Object Detection
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 581
Seeing Both Sides: Towards Bidirectional Semantic Alignment for Open-Vocabulary Camouflaged Object Segmentation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 582
Towards Robust Multi-Modal Semantic Segmentation with Teacher-Student Framework and Hybrid Prototype Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 583
REL-SF4PASS: Panoramic Semantic Segmentation with REL Depth Representation and Spherical Fusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 584
Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 585
From Softmax to Dirichlet: Evidential Learning for Semi-supervised Semantic Segmentation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 586
Particulate: Feed-Forward 3D Object Articulation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 587
HOPS: Hierarchical Open-vocabulary Part Segmentation with Attention-Aware Filtering and Affinity-Guided Enhancement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 588
Shape-of-You: Fused Gromov-Wasserstein Optimal Transport for Semantic Correspondence in-the-Wild
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 589
MEMO: Human-like Crisp Edge Detection Using Masked Edge Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 590
MUFASA: A Multi-Layer Framework for Slot Attention
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 591
ChangeBridge: Spatiotemporal Image Generation with Multimodal Controls for Remote Sensing
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 592
MOMO: Mars Orbital MOdel Foundation Model for Mars Orbital Applications
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 593
Seeing Through the Noise: Improving Infrared Small Target Detection and Segmentation from Noise Suppression Perspective
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 594
GeoBridge: A Semantic-Anchored Multi-View Foundation Model Bridging Images and Text for Geo-Localization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 595
GeoSANE: Learning Geospatial Representations from Models, Not Data
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 596
Brewing Stronger Features: Dual-Teacher Distillation for Multispectral Earth Observation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 597
Spectral Super-Resolution via Adversarial Unfolding and Data-Driven Spectrum Regularization: From Multispectral Satellite Data to NASA Hyperspectral Image
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 598
RAMEN: Resolution-Adjustable Multimodal Encoder for Earth Observation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 599
ORSATR-X: A Foundation Model based on Differential-and-Excitation Networks for Optical Remote Sensing Object Recognition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 600
SEBA: Sample-Efficient Black-Box Attacks on Visual Reinforcement Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 601
IAG: Input-aware Backdoor Attack on VLM-based Visual Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 602
DASH: A Meta-Attack Framework for Synthesizing Effective and Stealthy Adversarial Examples
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 603
AdapAction: Adaptive Target Action Backdoor Attack against GUI Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 604
Phantom: Physical Object Interactions as Dynamic Triggers for NMS-Exploited Backdoors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 605
Verifying Neural Network Robustness with Dual Perturbations
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 606
Defending Unauthorized Model Merging via Dual-Stage Weight Protection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 608
On the Role of Temporal Granularity in the Robustness of Spiking Neural Networks
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 609
Boosting Vision-Language-Action Finetuning with Feasible Action Neighborhood Prior
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 610
Exploring Conditions for Diffusion Models in Robotic Control
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 611
A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 612
Efficient Hybrid SE(3)-Equivariant Visuomotor Flow Policy via Spherical Harmonics for Robot Manipulation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 614
Scaling Spatial and Temporal Context for Robotic Imitation Learning Policies With Scene Graphs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 615
AdaDexTrack: Dynamic Modulation for Adaptive and Generalizable Dexterous Manipulation Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 616
GraspLDP: Towards Generalizable Grasping Policy via Latent Diffusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 617
MoEActok: A MoE-based Action Tokenizer for Vision-Language-Action Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 618
A Cross-view Fusion Framework for Robust 6-DoF Grasp Pose Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 619
SAVA-X: Ego-to-Exo Imitation Error Detection via Scene-Adaptive View Alignment and Bidirectional Cross View Fusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 620
PromptDepth: Efficient and Promptable Geometric 3D Vision Model for Embodied Intelligence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 621
Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3-D Constrained Terrains
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 622
PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 623
IGen: Scalable Data Generation for Robot Learning from Open-World Images
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 624
Hypergraph-State Collaborative Reasoning for Multi-Object Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 625
TGTrack: Temporal Generative Learning for Unified Single Object Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 626
GeoMotion: Rethinking Motion Segmentation via Latent 4D Geometry
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 627
Generalizable Structure-Aware Keypoint Correspondence for Category-Unified 3D Single Object Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 628
Generative Point Tracking and Forecasting
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 629
RAGTrack: Language-aware RGBT Tracking with Retrieval-Augmented Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 630
Dual-level Adaptation for Multi-Object Tracking: Building Test-Time Calibration from Experience and Intuition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 631
GMT: Effective Global Framework for Multi-Target Multi-Camera Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 632
Bridging Brain and Semantics: A Hierarchical Framework for Semantically Enhanced fMRI-to-Video Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 633
GraPHFormer: A Multimodal Graph Persistent Homology Transformer for the Analysis of Neuroscience Morphologies
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 634
DARC: Dual Adjustment Reasoning with Counterfactuals for Trustworthy Chest X-ray Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 635
Every Error has Its Magnitude: Asymmetric Mistake Severity Training for Multiclass Multiple Instance Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 636
Phrase-grounded APO for Improving Chest X-ray Report Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 637
Focus-to-Perceive Representation Learning: A Cognition-Inspired Hierarchical Framework for Endoscopic Video Analysis
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 638
OraPO: Oracle-educated Reinforcement Learning for Data-efficient and Factual Radiology Report Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 639
FluoCLIP: Stain-Aware Focus Quality Assessment in Fluorescence Microscopy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 640
CryoKRAQEN: Kernel-Regularized Annealing for Quantized Embedding Networks in Cryo-EM Heterogeneous Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 641
Building Robust Vision Encoders for Cross-Dataset Evaluation in Immunofluorescent Microscopy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 642
H2-Surv: Hierarchical Hyperbolic Multimodal Representation Learning for Survival Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 643
Dual-Level Hypergraph Generation for Addressing Feature Scarcity in Whole-Slide Image Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 644
Temporal Inversion for Learning Interval Change in Chest X-Rays
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 645
JUMP-Hand: Learning Joint-wise Uncertainty to Gate Mixture of View Experts for Multi-View 3D Hand Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 646
PAD-Hand: Physics-Aware Diffusion for Hand Motion Recovery
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 647
Anatomical Domain Shifts: Test-time Heterogeneous Adaptation for 3D Human Pose Prediction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 648
Unlocking Motion from Large Vision Models with a Semantic and Kinematic Duality for Gait Recognition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 649
Learning 3D Shape Fidelity Metric from Real-world Distortions
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 650
BarbieGait: An Identity-Consistent Synthetic Human Dataset with Versatile Cloth-Changing for Gait Recognition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 651
FisherPoser: Human Motion Estimation from Sparse Observations with Hierarchical Region-Wise Fisher-Matrix Uncertainty Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 652
EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 653
Ground Reaction Inertial Poser: Physics-based Human Motion Capture from Sparse IMUs and Insole Pressure Sensors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 654
FUN REC: Reconstructing Functional 3D Scenes from Egocentric Interaction Videos
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 655
VIMCAN: Visual-Inertial 3D Human Pose Estimation with Hybrid Mamba-Cross-Attention Network
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 656
Bringing Your Portrait to 3D Presence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 657
FLOW: Feature-Level Optimal Warping for Generalized Remote Physiological Measurement
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 659
UniMMAD: Unified Multi-Modal and Multi-Class Anomaly Detection via MoE-Driven Feature Decompression
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 660
BUSSARD: Normalizing Flows for Bijective Universal Scene-Specific Anomalous Relationship Detection
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 661
Multi-Prototype Compactness and Boundary-Aware Synthesis for Unsupervised Anomaly Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 662
PDD: Manifold-Prior Diverse Distillation for Medical Anomaly Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 663
Weakly Supervised Video Anomaly Detection with Anomaly-Connected Components and Intention Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 664
SubspaceAD: Training-Free Few-Shot Anomaly Detection via Subspace Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 665
Learning Spatial-Temporal Consistency for 3D Semantic Scene Completion
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 666
Generalizing Visual Geometry Priors to Sparse Gaussian Occupancy Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 667
Deformable Gaussian Occupancy: Decoupling Rigid and Nonrigid Motion with Factorized Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 668
OccAny: Generalized Unconstrained Urban 3D Occupancy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 669
Dr.Occ: Depth- and Region-Guided 3D Occupancy from Surround-View Cameras for Autonomous Driving
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 670
ShelfOcc: Native 3D Supervision beyond LiDAR for Vision-Based Occupancy Estimation
[
Poster]
Poster Session
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A None
Poster Session 4 & Exhibit Hall w/ Coffee Break
Art Program
Sat Jun 06 04:00 PM -- 04:30 PM (PDT) @ ExHall F None
Art Gallery Tour with Curator and Artists
Reception
Sat Jun 06 06:00 PM -- 08:00 PM (PDT) @ Bluebird Ballroom & ExHall C None
Reception
Break
Sun Jun 07 06:30 AM -- 08:00 AM (PDT) @ ExHall C None
Breakfast
Registration
Sun Jun 07 06:30 AM -- 12:00 PM (PDT) @ Lobby A None
Registration / Badge Pickup
Oral
Sun Jun 07 08:00 AM -- 08:12 AM (PDT) @ Four Seasons Ballroom None
AToken: A Unified Tokenizer for Vision
Oral
Sun Jun 07 08:00 AM -- 08:12 AM (PDT) @ Bluebird Ballroom None
Evidential Neural Radiance Fields
Oral
Sun Jun 07 08:00 AM -- 08:12 AM (PDT) @ Mile High Ballroom 3A - 4A None
BoostSLT: Boosting Sign Language Translation via a Plug-and-Play Diffusion-Based Semantic Enhancer
Oral
Sun Jun 07 08:00 AM -- 08:12 AM (PDT) @ Mile High Ballroom 1A - 2A None
AT-VLA: Adaptive Tactile Injection for Enhanced Feedback Reaction in Vision-Language-Action Models
Oral Session
Sun Jun 07 08:00 AM -- 09:15 AM (PDT) @ Mile High Ballroom 1A - 2A None
Oral Session 5C: Geometry and Robotics
Oral Session
Sun Jun 07 08:00 AM -- 09:15 AM (PDT) @ Four Seasons Ballroom None
Oral Session 5B: Generalization and Adaptation
Oral Session
Sun Jun 07 08:00 AM -- 09:15 AM (PDT) @ Mile High Ballroom 3A - 4A None
Oral Session 5D: Human-Centric Modeling & Lighting
Oral Session
Sun Jun 07 08:00 AM -- 09:15 AM (PDT) @ Bluebird Ballroom None
Oral Session 5A: Dynamic Perception
Oral
Sun Jun 07 08:12 AM -- 08:25 AM (PDT) @ Mile High Ballroom 3A - 4A None
ImmerIris: A Large-Scale Dataset and Benchmark for Off-Axis and Unconstrained Iris Recognition in Immersive Applications
Oral
Sun Jun 07 08:12 AM -- 08:25 AM (PDT) @ Bluebird Ballroom None
Global-Aware Edge Prioritization for Pose Graph Initialization
Oral
Sun Jun 07 08:12 AM -- 08:25 AM (PDT) @ Mile High Ballroom 1A - 2A None
Learning Diffeomorphism for Medical Image Registration with Time-Embedded Architectures Using Semigroup Regularization
Oral
Sun Jun 07 08:12 AM -- 08:25 AM (PDT) @ Four Seasons Ballroom None
Confusion-Aware Spectral Regularizer for Long-Tailed Recognition
Oral
Sun Jun 07 08:25 AM -- 08:37 AM (PDT) @ Four Seasons Ballroom None
Learning Latent Concepts for Detecting Out-of-Distribution Objects
Oral
Sun Jun 07 08:25 AM -- 08:37 AM (PDT) @ Mile High Ballroom 3A - 4A None
OLATverse: A Large-scale Real-world Object Dataset with Precise Lighting Control
Oral
Sun Jun 07 08:25 AM -- 08:37 AM (PDT) @ Bluebird Ballroom None
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding
Oral
Sun Jun 07 08:25 AM -- 08:37 AM (PDT) @ Mile High Ballroom 1A - 2A None
QuadSync: Quadrifocal Tensor Synchronization via Tucker Decomposition
Oral
Sun Jun 07 08:37 AM -- 08:50 AM (PDT) @ Bluebird Ballroom None
Optical Flow Matching: Reframing Optical Flow as Continuous Transport Dynamics
Oral
Sun Jun 07 08:37 AM -- 08:50 AM (PDT) @ Four Seasons Ballroom None
Learning Like Humans: Analogical Concept Learning for Generalized Category Discovery
Oral
Sun Jun 07 08:37 AM -- 08:50 AM (PDT) @ Mile High Ballroom 1A - 2A None
SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation
Oral
Sun Jun 07 08:37 AM -- 08:50 AM (PDT) @ Mile High Ballroom 3A - 4A None
OpenDance: Multimodal Controllable 3D Dance Generation with Large-scale Internet Data
Oral
Sun Jun 07 08:50 AM -- 09:02 AM (PDT) @ Mile High Ballroom 3A - 4A None
POLAR: A Portrait OLAT Dataset and Generative Framework for Illumination-Aware Face Modeling
Oral
Sun Jun 07 08:50 AM -- 09:02 AM (PDT) @ Mile High Ballroom 1A - 2A None
Structural Action Transformer for 3D Dexterous Manipulation
Oral
Sun Jun 07 08:50 AM -- 09:02 AM (PDT) @ Bluebird Ballroom None
SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker
Oral
Sun Jun 07 08:50 AM -- 09:02 AM (PDT) @ Four Seasons Ballroom None
Understanding and Enforcing Weight Disentanglement in Task Arithmetic
Oral
Sun Jun 07 09:02 AM -- 09:15 AM (PDT) @ Mile High Ballroom 1A - 2A None
TESO: Online Tracking of Essential Matrix by Stochastic Optimization
Oral
Sun Jun 07 09:02 AM -- 09:15 AM (PDT) @ Mile High Ballroom 3A - 4A None
Relightable Holoported Characters: Capturing and Relighting Dynamic Human Performance from Sparse Views
Oral
Sun Jun 07 09:02 AM -- 09:15 AM (PDT) @ Bluebird Ballroom None
U^2Flow: Uncertainty-Aware Unsupervised Optical Flow Estimation
Oral
Sun Jun 07 09:02 AM -- 09:15 AM (PDT) @ Four Seasons Ballroom None
Understanding Task Transfer in Vision-Language Models
[
Slides]
Break
Sun Jun 07 09:15 AM -- 09:30 AM (PDT) None
Courtesy Break
Keynote
Sun Jun 07 09:30 AM -- 10:30 AM (PDT) @ Bluebird Ballroom None
Scaling Laws vs. Neural Laws: Toward More Natural Artificial Vision
Poster Setup
Sun Jun 07 10:15 AM -- 10:45 AM (PDT) @ ExHall A None
Poster Setup
Demonstration
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F None
Demos
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 1
Evidential Neural Radiance Fields
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 2
Global-Aware Edge Prioritization for Pose Graph Initialization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 3
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 4
Optical Flow Matching: Reframing Optical Flow as Continuous Transport Dynamics
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 5
SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 6
U^2Flow: Uncertainty-Aware Unsupervised Optical Flow Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 7
AToken: A Unified Tokenizer for Vision
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 8
Confusion-Aware Spectral Regularizer for Long-Tailed Recognition
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 9
Learning Latent Concepts for Detecting Out-of-Distribution Objects
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 10
Learning Like Humans: Analogical Concept Learning for Generalized Category Discovery
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 11
Understanding and Enforcing Weight Disentanglement in Task Arithmetic
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 13
AT-VLA: Adaptive Tactile Injection for Enhanced Feedback Reaction in Vision-Language-Action Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 14
Learning Diffeomorphism for Medical Image Registration with Time-Embedded Architectures Using Semigroup Regularization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 15
QuadSync: Quadrifocal Tensor Synchronization via Tucker Decomposition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 16
SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 17
Structural Action Transformer for 3D Dexterous Manipulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 18
TESO: Online Tracking of Essential Matrix by Stochastic Optimization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 19
BoostSLT: Boosting Sign Language Translation via a Plug-and-Play Diffusion-Based Semantic Enhancer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 20
ImmerIris: A Large-Scale Dataset and Benchmark for Off-Axis and Unconstrained Iris Recognition in Immersive Applications
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 21
OLATverse: A Large-scale Real-world Object Dataset with Precise Lighting Control
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 22
OpenDance: Multimodal Controllable 3D Dance Generation with Large-scale Internet Data
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 23
POLAR: A Portrait OLAT Dataset and Generative Framework for Illumination-Aware Face Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 24
Relightable Holoported Characters: Capturing and Relighting Dynamic Human Performance from Sparse Views
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 25
Scaling View Synthesis Transformers
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 26
WildPose: A Unified Framework for Robust Pose Estimation in the Wild
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 27
MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 28
Revisiting Monocular SLAM with Spatio-Temporal Scene Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 29
Minimal Constraint Relaxation for Multiview Autocalibration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 30
Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 31
GGPT: Geometry-Grounded Point Transformer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 32
MERG3R: A Divide-and-Conquer Approach to Large-Scale Neural Visual Geometry
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 33
Unlocking the Power of Critical Factors for 3D Visual Geometry Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 34
KV-Tracker: Real-Time Pose Tracking with Transformers
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 35
InstructMix2Mix: Consistent Sparse-View Editing Through Multi-View Model Personalization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 36
From Rays to Projections: Better Inputs for Feed-Forward View Synthesis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 37
SLARM: Streaming and Language-Aligned Reconstruction Model for Dynamic Scenes
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 38
Parallel Rigidity Matters for Bundle Adjustment
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 39
Simple but Effective Triplet-Based Compression Strategies for Compact Visual Localization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 40
VIAFormer: Voxel-Image Alignment Transformer for High-Fidelity Voxel Refinement
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 41
Mining Attribute Subspaces for Efficient Fine-tuning of 3D Foundation Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 42
DualPrim: Compact 3D Reconstruction with Positive and Negative Primitives
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 43
StyleGallery: Training-free and Semantic-aware Personalized Style Transfer from Arbitrary Image References
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 44
DynFusion: Rethinking Condition Fusion for Adaptive Multi-Conditional Text-to-Image Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 45
Agentic Retoucher for Text-To-Image Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 46
StyleDoctor: Towards Specialist Reward Model for Style-centric Generation Tasks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 47
SwitchCraft: Training-Free Multi-Event Video Generation with Attention Controls
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 48
Premier: Personalized Preference Modulation with Learnable User Embedding in Text-to-Image Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 49
Paper2Figure: A Multi-Agent Collaborative System for Figure Generation Towards Academic Research Paper
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 50
Adapting In-context Generation for Enhanced Composed Image Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 51
Transition Models: Rethinking the Generative Learning Objective
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 52
Rethinking Glyph Spatial Information in Font Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 53
StreamDiT: Real-Time Streaming Text-to-Video Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 54
ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 55
Camera Control for Text-to-Image Generation via Learning Viewpoint Tokens
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 56
3D Space as a Scratchpad for Editable Text-to-Image Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 57
Aligning Multi-Character Narrative Image Generation with Multi-Aspect Human Preferences
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 58
FoleyDirector: Directing Temporal Controllable Video-to-Audio Generation via Fine-Grained Temporal Scripts
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 59
DCoAR: Deep Concept Injection into Unified Autoregressive Models for Personalized Text-to-Image Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 60
DreamOmni2: Multimodal Instruction-based Generation and Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 61
AutoDebias: An Automated Framework for Detecting and Mitigating Backdoor Biases in Text-to-Image Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 62
PosterIQ: A Design Perspective Benchmark for Poster Understanding and Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 63
IVAAN: Instance-level Vision-Language Alignment via Attribute-Guided Text Prompts Generation for Nuclei Analysis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 64
IsoCLIP: Decomposing CLIP Projectors for Efficient Intra-modal Alignment
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 65
TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 66
BioVITA: Biological Dataset, Model, and Benchmark for Visual-Textual-Acoustic Alignment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 67
Boosting Visual Reprogramming for CLIP with Dual Granularity Alignment
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 68
Decouple to Generalize: Context-First Self-Evolving Learning for Data-Scarce Vision-Language Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 69
UniGen-1.5: Enhancing Image Generation and Editing through Reward Unification in RL
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 70
PolySLGen: Online Multimodal Speaking-Listening Reaction Generation in Polyadic Interaction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 71
Label What Matters: Modality-Balanced and Difficulty-Aware Multimodal Active Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 72
Unified Personalized Understanding, Generating and Editing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 73
MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 74
Towards Uncertainty-aware Unsupervised Domain Adaptation for Videos and Time-Series with Causal Optimal Transport
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 75
Foundation Model Priors Enhance Object Focus in Feature Space for Source-Free Object Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 76
Decision Boundary-aware Generation for Long-tailed Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 77
Towards Stable Federated Continual Test-Time Adaptation in Wild World
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 78
HyCal: A Training-Free Prototype Calibration Method for Cross-Discipline Few-Shot Class-Incremental Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 79
ACE-Merging: Data-Free Model Merging with Adaptive Covariance Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 80
CHIPS: Efficient CLIP Adaptation via Curvature-aware Hybrid Influence-based Data Selection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 81
Addressing Exacerbated Attention Sink for Source-Free Cross-Domain Few-Shot Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 82
Depth Hypothesis Guided Iterative Refinement for Event–Image Monocular Depth Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 83
High-Quality and Efficient Turbulence Mitigation with Events
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 84
Tracking through Severe Occlusion via Event-Derived Transient Cues
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 85
FastEventDGS: Deformable Gaussian Splatting for Fast Dynamic Scenes from a Single Event Camera
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 86
Event-Based Motion Deblurring Using Task-Oriented 3D Gaussian Event Representations
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 87
From Corners to Fiducial Tags: Revisiting Checkerboard Calibration for Event Cameras
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 88
Extending Embodied Question Answering from Perception to Decision
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 89
Dejavu: Towards Experience Feedback Learning for Embodied Intelligence
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 90
Demo2Tutorial: From Human Experience to Multimodal Software Tutorials
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 91
MaskDexGrasp: Generative Masked Modeling for Part-Aware Dexterous Grasp Synthesis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 92
Predict Before You Explore: Predictive Planning with Specialized Memory for Embodied Question Answering
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 93
VideoWeaver: Multimodal Multi-View Video-to-Video Transfer for Embodied Agents
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 94
MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 95
Align While Search: Belief-Guided Exploratory Inference for World-Grounded Embodied Agents
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 96
Rethinking Intermediate Representation for VLM-based Robot Manipulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 98
FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-and-Language Navigation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 99
UniLight: A Unified Representation for Lighting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 100
MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 101
Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 102
Hist2Style: Histogram-Guided Stylization with Bilateral Grids
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 103
Harmonic Canvas: Inversion-Free Editing for Visually-Guided Music Style Transfer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 104
How to Take a Memorable Picture? Empowering Users with Actionable Feedback
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 105
UniEdit-I: Training-free Image Editing for Unified VLM via Iterative Understanding, Editing and Verifying
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 106
SCIEval: Evaluating and Benchmarking the Faithfulness of Scientific Image Generation and Interpretation with Large Multimodal Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 107
GeoRelight: Learning Joint Geometrical Reconstruction and Relighting with Flexible Multi-Modal Diffusion Transformers
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 108
HAD: Hallucination-Aware Diffusion Priors for 3D Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 109
Catalyst4D: High-Fidelity 3D-to-4D Scene Editing via Dynamic Propagation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 110
ReFlow: Self-correction Motion Learning for Dynamic Scene Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 111
Semantic Foam: Unifying Spatial and Semantic Scene Decomposition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 112
NVGS: Neural Visibility for Occlusion Culling in 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 113
NeAR: Coupled Neural Asset–Renderer Stack
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 114
Thermal is Always Wild: Characterizing and Addressing Challenges in Thermal-Only Novel View Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 115
PhysGM: Large Physical Gaussian Model for Feed-Forward 4D Synthesis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 116
Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 117
TM-BSN: Triangular-Masked Blind-Spot Network for Real-World Self-Supervised Image Denoising
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 118
Multinex: Lightweight Low-light Image Enhancement via Multi-prior Retinex
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 119
Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 120
ExpoCM: Exposure-Aware One-Step Generative Single-Image HDR Reconstruction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 121
Physically-Grounded Turbulence Mitigation with Frame-Shared Degradation Parameters
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 122
Convexity-Aware Noise Calibration: A Self-Supervised Framework for Noise-Level-Unknown Image Denoising
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 123
UCMNet: Uncertainty-Aware Context Memory Network for Under-Display Camera Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 124
Beyond the Ground Truth: Enhanced Supervision for Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 125
ShiftLUT: Spatial Shift Enhanced Look-Up Tables for Efficient Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 126
Bilevel Layer-Positioning LoRA for Real Image Dehazing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 127
SD-FSMIS: Adapting Stable Diffusion for Few-Shot Medical Image Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 129
SHAPE: Structure-aware Hierarchical Unsupervised Domain Adaptation with Plausibility Evaluation for Medical Image Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 130
Delving Aleatoric Uncertainty in Medical Image Segmentation via Vision Foundation Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 131
Revisiting 2D Foundation Models for Scalable 3D Medical Image Classification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 133
Simple-ViLMedSAM: Simple Text Prompts Meet Vision-Language Models for Medical Image Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 134
NeuroSeg Meets DINOv3: Transferring 2D Self-Supervised Visual Priors to 3D Neuron Segmentation via DINOv3 Initialization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 135
Multi-Paradigm Collaborative Adversarial Attack Against Multi-Modal Large Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 136
TINA: Text-Free Inversion Attack for Unlearned Text-to-Image Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 137
Jailbreaking Vision-Language Models via Dissonance-Guided Suffix Optimization and Image–Phrase Injection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 138
BlackMirror: Black-Box Backdoor Detection for Text-to-Image Models via Instruction-Response Deviation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 139
VCP-Attack: Visual-Contrastive Projection for Transferable Black-Box Targeted Attacks on Large Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 140
Adapter Shield: A Unified Framework with Built-in Authentication for Preventing Unauthorized Zero-Shot Image-to-Image Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 141
LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 142
Transform to Transfer: Boosting Adversarial Attack Transferability on Vision-Language Pre-training Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 143
Mask to Align, Weight to Disambiguate: Reliable Unsupervised Cross-Modal Hashing with Masked-Weight Contrast
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 144
Reliable Clustering Number Estimation for Contrastive Multi-View Clustering
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 145
Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 146
Enhance-then-Balance Modality Collaboration for Robust Multimodal Sentiment Analysis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 147
SonoWorld: From One Image to a 3D Audio-Visual Scene
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 149
EXOTIC: External Vision-driven Incomplete Multi-view Classification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 150
Easy2Hard: From Partially to Fully Unmatched Modalities as Negative Samples in Contrastive Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 151
OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 152
BALM: A Model-Agnostic Framework for Balanced Multimodal Learning under Imbalanced Missing Rates
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 153
UniT: Unified Multimodal Chain-of-Thought Test-time Scaling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 154
Multi-modal Test-time Adaptation via Adaptive Probabilistic Gaussian Calibration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 155
Information-Theoretic Decomposition for Multimodal Interaction Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 156
Is the Modality Gap a Bug or a Feature? A Robustness Perspective
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 157
Omni-Fake: Benchmarking Unified Multimodal Social Media Deepfake Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 158
MUST: Modality-Specific Representation-Aware Transformer for Diffusion-Enhanced Survival Prediction with Missing Modality
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 159
VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 160
MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 161
SeD-UD: An Influence-Driven and Hierarchically-Decoupled Information Bottleneck for Multimodal Intent Recognition
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 162
MultiModalPFN: Extending Prior-Data Fitted Networks for Multimodal Tabular Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 163
LacTokGen: Latent Consistency Tokenizer for 1024-pixel Image Generation by 256 Tokens
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 164
FlowSteer: Guiding Few-Step Image Synthesis with Authentic Trajectories
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 165
Visual Autoregressive Modeling via Next Focus Prediction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 166
Semantic Context Matters: Improving Conditioning for Autoregressive Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 167
TempoMaster: Efficient Long Video Generation via Next-Frame-Rate Prediction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 168
FlashIn: Fast and Accurate Image Inversion for Real-time Image Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 169
EasyV2V: A High-quality Instruction-based Video Editing Framework
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 170
One Algorithm to Align Them All
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 171
VGA-Bench: A Unified Benchmark and Multi-Model Framework for Video Aesthetics and Generation Quality Evaluation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 172
Improved Mean Flows: On the Challenges of Fastforward Generative Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 173
SynMotion: Semantic-Visual Adaptation for Motion Customized Video Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 174
Match-and-Fuse: Consistent Generation from Unstructured Image Sets
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 175
Mixture of Style Experts for Diverse Image Stylization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 176
Mirai: Autoregressive Visual Generation Needs Foresight
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 177
Align Images Before You Generate
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 178
Bridging the Perception Gap in Image Super-Resolution Evaluation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 179
Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 180
Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 181
IAFMNet: Information-Aware Feature Modulation for Efficient Super-Resolution
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 182
Physics-Consistent Diffusion for Efficient Fluid Super-Resolution via Multiscale Residual Correction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 183
Bridging Fidelity-Reality with Controllable One-Step Diffusion for Image Super-Resolution
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 184
Omni-Supervised Motion Editing: Balancing Change and Invariance through Positive-Negative Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 185
FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 186
Cross-Axis Feature Fusion with Joint-Wise Motion Difference Prediction for Text-Based 3D Human Motion Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 187
MotionMaster: Generalizable Text-Driven Motion Generation and Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 188
OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 189
Towards Decompositional Human Motion Generation with Energy-Based Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 190
PAMotion: Physics-Aware Motion Generation for Full-Body Interaction with Multiple Objects
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 191
Sketch2Colab: Sketch-Conditioned Multi-Human Animation via Controllable Flow Distillation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 192
ViHOI: Human-Object Interaction Synthesis with Visual Priors
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 193
CLEP: Contrastive Language-Pose Pretraining
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 194
OpenFS: Multi-Hand-Capable Fingerspelling Recognition with Implicit Signing-Hand Detection and Frame-Wise Letter-Conditioned Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 195
ARMFlow: AutoRegressive MeanFlow for Online 3D Human Reaction Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 196
InterPhys: Physics-aware Human Motion Synthesis in a Dynamic Scene
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 197
Beyond Mimicry: Learning Whole-Body Human-Humanoid Interaction from Human-Human Demonstrations
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 198
PHAC: Promptable Human Amodal Completion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 199
CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 200
IntrinsicWeather: Controllable Weather Editing in Intrinsic Space
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 201
Outlier-Robust Diffusion Solvers for Inverse Problems
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 202
Beyond Fixed Formulas: Data-Driven Linear Predictor for Efficient Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 204
Diff-SemiER: Transparency-Aware Adaptive Fusion Diffusion Model with Generative Prior for Semi-Transparent Eyeglasses Removal
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 205
KLIP: Localized Distribution Shift Detection via KL-Divergence with Diffusion Priors in Inverse Problems
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 206
Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 207
Taming Generative Diffusion Model for Task-Oriented Infrared Imaging
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 208
Attention, May I Have Your Decision? Localizing Generative Choices in Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 209
RxnCaption: Reformulating Reaction Diagram Parsing as Visual Prompt Guided Captioning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 210
More than the Sum: Panorama-Language Models for Adverse Omni-Scenes
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 211
DiGraphHal-Bench: Evaluating Multimodal Large Language Models on Complex Directed Graphs
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 212
SEA-Vision: A Multilingual Benchmark for Comprehensive Document and Scene Text Understanding in Southeast Asia
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 214
Spot The Ball: A Benchmark for Visual Social Inference
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 216
E-comIQ-ZH: A Human-Aligned Dataset and Benchmark for Fine-Grained Evaluation of E-commerce Posters with Chain-of-Thought
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 217
GeoWorld: Geometric World Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 218
ORD: Object-Relation Decoupling for Generalized 3D Visual Grounding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 219
Benchmarking PhD-Level Coding in 3D Geometric Computer Vision
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 220
MonoVLM: Monocular 3D Visual Grounding with Vision Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 221
Curvature-Aware Captioning: Leveraging Geodesic Attention for 3D Scene Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 222
SPREAD: Spatial-Physical REasoning via geometry Aware Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 223
ExtrinSplat: Decoupling Geometry and Semantics for Open-Vocabulary Understanding in 3D Gaussian Splatting
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 224
SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 225
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 226
VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 227
Merge3D: Efficient 3D Multimodal LLMs via Joint 2D-3D Token Merging
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 228
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 229
LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 230
Quota-Calibrated Fine-Grained Alignment with Context-Aware Marginals for Text-based Person Retrieval
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 231
Evo-Retriever: LLM-Guided Curriculum Evolution with Viewpoint-Pathway Collaboration for Multimodal Document Retrieval
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 232
Taxonomy-Aware Representation Alignment for Hierarchical Visual Recognition with Large Multimodal Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 233
FAAR: Efficient Frequency-Aware Multi-Task Fine-Tuning via Automatic Rank Selection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 234
Model Merging in the Essential Subspace
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 235
Beyond Semantic Search: Towards Referential Anchoring in Composed Image Retrieval
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 236
SAVE: Speech-Aware Video Representation Learning for Video-Text Retrieval
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 237
MarkushGrapher-2: End-to-end Multimodal Recognition of Chemical Structures
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 239
EthoCLIP: Ontology-Enhanced Video-Language Pretraining for Animal Behavior Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 240
TrajTok: Learning Trajectory Tokens Enhances Video Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 241
Streaming Video Instruction Tuning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 242
VidPrism: Heterogeneous Mixture of Experts for Image-to-Video Transfer
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 244
From Static to Dynamic: Exploring Self-supervised Image-to-Video Representation Transfer Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 245
Learnable Motion-Focused Tokenization for Effective and Efficient Video Unsupervised Domain Adaptation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 246
FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 247
Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 248
Video Panels for Long Video Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 249
Gaze Target Estimation Anywhere with Concepts
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 250
Select, Hypothesize and Verify: Towards Verified Neuron Concept Interpretation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 251
Finding Distributed Object-Centric Properties in Self-Supervised Transformers
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 253
See Through the Noise: Improving Domain Generalization in Gaze Estimation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 254
Mechanisms of Object Localization in Vision–Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 255
mmWaveFlow: Unified Enhancement and Generation of mmWave Human Point Clouds
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 256
From Feature Learning to Spectral Basis Learning: A Unifying and Flexible Framework for Efficient and Robust Shape Matching
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 257
Topology-aware Feature Propagation for Unsupervised Non-rigid Point Cloud Correspondence
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 258
BEV-SLD: Self-Supervised Scene Landmark Detection for Global Localization with LiDAR Bird’s-Eye View Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 259
SAG-GNN: Semantic-Aware Guided GNN for Descriptor-Free 2D-3D Matching
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 260
LiREC-Net: A Target-Free and Learning-Based Network for LiDAR, RGB, and Event Calibration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 261
GM-R^2: Generative Matching Learning for Unsupervised Geometric Representation and Registration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 262
4D Local Modeling Toward Dynamic Global Perception for Ambiguity-free Rotation-Invariant Point Cloud Analysis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 263
PointNSP: Autoregressive 3D Point Cloud Generation with Next-Scale Level-of-Detail Prediction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 264
MORE-STEM: Long-Short MemOry REcall and Spatio-TEmporal Consistency Model for Query-Driven 3D/4D Point Cloud Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 265
Low-Rank Test-Time Training for Pre-Trained Point Cloud Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 266
STAR: Test-Time Adaptation Can Enhance Universal Prompt Learning for Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 267
Exploring Visual Pretraining for Learning Language Intelligence
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 268
VL-Eraser: Vacuum Distillation for Machine Unlearning in Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 269
DeAR: Fine-Grained VLM Adaptation by Decomposing Attention Head Roles
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 270
SynCLIP: Synonym-Coherent Language-Image Pretraining for Robust Open-Vocabulary Dense Perception
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 271
MODIX: A Training-Free Multimodal Information-Driven Positional Index Scaling for Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 272
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 273
ORION: ORthonormal Text Encoding for Universal VLM AdaptatION
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 274
CASPA: Graph-Structured Concept Anchors for Modality-Agnostic Adaptation in Vision–Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 276
HOG-Layout: Hierarchical 3D Scene Generation, Optimization and Editing via Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 277
Towards Human-Like Robot Handwriting via Contour-Aware Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 278
MajutsuCity: Language-driven Aesthetic-adaptive City Generation with Controllable 3D Assets and Layouts
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 279
VectorArk: Learning Practical Image Vectorization with Rounded Polygon Representation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 280
OctoT2I: A Self-Evolving Agentic Text-to-Image Router
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 281
LottieGPT: Tokenizing Vector Animation for Autoregressive Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 282
SEA: Evaluating Sketch Abstraction Efficiency via Element-level Commonsense Visual Question Answering
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 283
Selective Amnesia using Contrastive Subnet Erasure for Class Level Unlearning in Vision Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 284
A Closed-Form Solution for Debiasing Vision-Language Models with Utility Guarantees Across Modalities and Tasks
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 285
Rank-Guided Pseudo-Bias Learning for Robust Black-Box Adaptation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 286
Diagnosing and Repairing Unsafe Channels in Vision-Language Models via Causal Discovery and Dual-Modal Safety Subspace Projection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 287
WaTeRFlow: Watermark Temporal Robustness via Flow Consistency
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 288
DSO: Direct Steering Optimization for Bias Mitigation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 290
SineProject: Machine Unlearning for Stable Vision-Language Alignment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 291
HiLoRA: Hierarchical Low-Rank Adaptation for Personalized Federated Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 292
OS-Fed: One Snapshot Is All You Need
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 293
FedAlign: Differentially Private Distribution Alignment for Non-IID Federated Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 294
Guiding Diffusion Models with Fine-Grained Conditions and Semantics-Preserving Sampling for One-Shot Federated Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 295
Personalized Federated Training of Diffusion Models with Privacy Guarantees
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 297
Understanding Temporal Logic Consistency in Video-Language Models through Cross-Modal Attention Discriminability
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 298
Small Object, Great Challenge: A Benchmark for Small Object Visual Grounding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 299
UFVideo: Towards Unified Fine-Grained Video Cooperative Understanding with Large Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 300
ReMoRa: Multimodal Large Language Model based on Refined Motion Representation for Long-Video Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 301
CaST-Bench: Benchmarking Causal Chain-Grounded Spatio-Temporal Reasoning for Video Question Answering
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 302
HERO: Hierarchical Embedding-Refinement for Open-Vocabulary Temporal Sentence Grounding in Videos
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 303
Scaling the Long Video Understanding of Multimodal Large Language Models via Visual Memory Mechanism
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 304
Hybrid Token Compression for Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 305
Focus, Don’t Prune: Identifying Instruction-Relevant Regions for Information-Rich Image Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 306
When Token Pruning is Worse than Random: Understanding Visual Token Information in VLLMs
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 307
VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 308
BiGain: Unified Token Compression for Joint Generation and Classification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 309
Hi-Lo Prune: Look at What You'll Lose before Pruning with Hierarchical Token Selection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 310
VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 311
Bridge: Basis-Driven Causal Inference Marries VFMs for Domain Generalization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 312
In Pursuit of Pixel Supervision for Visual Pre-training
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 313
GaussianMatch: Semi-Supervised Regression with Pseudo-Label Filtering via Multi-View Gaussian Consistency
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 314
TAR: Token-Aware Refinement for Fine-grained Generalized Category Discovery
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 315
Semantic Noise Reduction via Teacher-Guided Dual-Path Audio-Visual Representation Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 316
The Universal Normal Embedding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 317
Bypassing the Transport Plan: Dynamic Reweighting for Out-of-Distribution Detection with Optimal Transport
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 318
Cross-domain Dual-stream Feature Disentanglement for Brain Disorder Prediction with Sparsely Labeled PET
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 319
Debiased Sample Selection for Learning with Noisy Labels
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 321
Open-Ended Instruction Realization with LLM-Enabled Multi-Planner Scheduling in Autonomous Vehicles
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 322
EE-RL: Vision Language Guided Reinforcement Learning with Explorer and Expert model for End-to-End Autonomous Driving
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 323
Sensor2Sensor: Cross-Embodiment Sensor Conversion for Autonomous Driving
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 324
SHARP: Short-Window Streaming for Accurate and Robust Prediction in Motion Forecasting
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 325
DriveCombo: Benchmarking Compositional Traffic Rule Reasoning in Autonomous Driving
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 326
CausalVAD: De-confounding End-to-End Autonomous Driving via Causal Intervention
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 328
Learning to Drive is a Free Gift: Large-Scale Label-Free Autonomy Pretraining from Unposed In-The-Wild Videos
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 329
WhisperNet: A Scalable Solution for Bandwidth-Efficient Collaboration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 330
Efficient Equivariant Transformer for Self-Driving Agent Modeling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 331
Generalizable Co-Salient Object Detection via Mixed Content-Style Modulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 332
Saliency-Driven Token Merging for Vision Transformers
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 333
RISE: Single Static Radar-based Indoor Scene Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 334
Mixture-of-Experts based Feature Decoupling for Open Vocabulary Scene Graph Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 335
TF-SSD: A Strong Pipeline via Synergic Mask Filter for Training-free Co-salient Object Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 336
Denoise and Align: Towards Source-Free UDA for Robust Panoramic Semantic Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 337
SPOT: Spatiotemporal Prompt Optimization for Motion-Stabilized MLLM-Guided Video Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 338
Changes in Real Time: Online Scene Change Detection with Multi-View Fusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 339
Subspace Alignment for CLIP-based Continual Learning via Canonical Correlation Analysis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 340
DGS: Dual Gradient and Semantic-Shift Guided Low-Rank Adaptation for Class Incremental Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 341
Dynamic Magic: Unleashing Restricted Knowledge for Lifelong Person Re-Identification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 342
Which Concepts to Forget and How to Refuse? Decomposing Concepts for Continual Unlearning in Large Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 343
Temporal Imbalance of Positive and Negative Supervision in Class-Incremental Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 344
Forging a Dynamic Memory: Retrieval-Guided Continual Learning for Generalist Medical Foundation Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 345
Dance Across Shifts: Forward-Facilitation Continual Test-Time Adaptation through Dynamic Style Bridging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 346
Few-Shot Hybrid Incremental Learning: Continually Learning under Data Scarcity and Task Uncertainty
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 347
High-Fidelity Mobile Avatars with Pruned Local Blendshapes
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 348
PhysSkin: Real-Time and Generalizable Physics-Based Animation via Self-Supervised Neural Skinning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 349
Bridging Privacy and Provenance: Traceable Virtual Identity Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 350
PortraitDirector: A Hierarchical Disentanglement Framework for Controllable and Real-time Facial Reenactment
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 351
Dynamic Label Noise Suppression with Optimal Teacher Pool for Facial Expression Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 352
MimicTalker: A Multimodal Interactive and Memory-Enhanced Framework for Real-Time Dyadic 3D Head Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 353
DecoVLN: Decoupling Observation, Reasoning, and Correction for Vision-and-Language Navigation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 354
HybridDriveVLA: Vision-Language-Action Model with Visual CoT reasoning and ToT Evaluation for Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 355
NavForesee: A Unified Vision-Language World Model for Hierarchical Planning and Dual-Horizon Navigation Prediction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 356
LookasideVLN: Direction-Aware Aerial Vision-and-Language Navigation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 359
FreeForm: Reduced-Order Deformable Simulation from Particle-Based Skinning Eigenmodes
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 360
GeoDiff4D: Geometry-Aware Diffusion for 4D Head Avatar Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 361
4DEquine: Disentangling Motion and Appearance for 4D Equine Reconstruction from Monocular Video
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 362
PhysHO: Physics-Based Dynamic 3D Gaussian Human and Object from Monocular Video
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 363
ProgressiveAvatars: Progressive Animatable 3D Gaussian Avatars
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 364
ZINA: Multimodal Fine-grained Hallucination Detection and Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 365
Mitigating Multimodal Hallucinations via Gradient-based Self-Reflection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 366
HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 367
KVSmooth: Mitigating Hallucination in Multi-modal Large Language Models through Key-Value Smoothing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 368
ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Video Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 369
Tell Model Where to Look: Mitigating Hallucinations in MLLMs by Vision-Guided Attention
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 370
Circular-DPO: Aligning Multi-Stage 3D Generative Models via Preference Feedback Loop
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 371
Cloning Deterministic Worlds: The Critical Role of Latent Geometry in Long-Horizon World Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 372
PrITTI: Primitive-based Generation of Controllable and Editable 3D Semantic Urban Scenes
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 373
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 374
ExPose: Reinforcing Video Generation Models for Extreme Pose Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 375
Choreographing a World of Dynamic Objects
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 376
SounDiT: Geo-Contextual Soundscape-to-Landscape Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 377
Vista4D: Video Reshooting with 4D Point Clouds
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 378
CamDirector: Towards Long-Term Coherent Video Trajectory Editing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 379
Elastic3D: Controllable Stereo Video Conversion with Guided Latent Decoding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 380
Decoupling Bias, Aligning Distributions: Synergistic Fairness Optimization for Deepfake Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 381
Target-Aware Invertible Encoder with Reconstruction Guidance for Infrared Small Target Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 382
BDNet: Bio-Inspired Dual-Backbone Small Object Detection Network
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 383
ElasticFormer: Detecting Objects in HRW Shots via Elastic Computing Vision Transformer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 384
RGB-Event based Pedestrian Attribute Recognition: A Benchmark Dataset and An Asymmetric RWKV Fusion Framework
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 385
FusionAgent: A Multimodal Agent with Dynamic Model Selection for Human Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 386
Free-Grained Hierarchical Visual Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 387
URICA: A Uniformity Region Affine Identifier Capture Algorithm for Arbitrary Region Retrieval in Pathology Images
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 389
DetAny4D: Detect Anything 4D Temporally in a Streaming RGB Video
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 390
Follow the Saliency: Supervised Saliency for Retrieval-augmented Dense Video Captioning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 391
Video-CoE: Reinforcing Video Event Prediction via Chain of Events
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 392
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 393
VRR-QA: Visual Relational Reasoning in Videos Beyond Explicit Cues
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 394
Question-guided Visual Compression with Memory Feedback for Long-Term Video Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 395
CURVE: A Benchmark for Cultural and Multilingual Long Video Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 396
SVBench: Evaluation of Video Generation Models on Social Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 397
Hierarchical Long Video Understanding with Audiovisual Entity Cohesion and Agentic Search
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 398
LifeEval: A Multimodal Benchmark for Assistive AI in Egocentric Daily Life Tasks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 399
Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 401
YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 402
CADC: Content Adaptive Diffusion-Based Generative Image Compression
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 403
FG-Portrait: 3D Flow Guided Editable Portrait Animation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 404
ResCa: Residual Caching for Diffusion Transformers Acceleration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 405
IP-Adapter Is All You Need: Towards Fine-Tuning-Free Diffusion-Based Talking Face Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 406
SRA 2: Variational Autoencoder Self-Representation Alignment for Efficient Diffusion Training
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 407
InnoAds-Composer: Efficient Condition Composition for E-Commerce Poster Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 409
SODA: Sensitivity-Oriented Dynamic Acceleration for Diffusion Transformer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 410
DSERT-RoLL: Robust Multi-Modal Perception for Diverse Driving Conditions with Stereo Event-RGB-Thermal Cameras, 4D Radar, and Dual-LiDAR
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 412
ReManNet: A Riemannian Manifold Network for Monocular 3D Lane Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 413
PanDA: Unsupervised Domain Adaptation for Multimodal 3D Panoptic Segmentation in Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 414
STUR3D: Spatio-Temporal Unified Representation Learning for 3D Object Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 415
Exploring 6D Object Pose Estimation with Deformation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 416
SearchAD: Large-Scale Rare Image Retrieval Dataset for Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 417
Improving Vision-language Models with Perception-centric Process Reward Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 418
X-PCR: A Benchmark for Cross-modality Progressive Clinical Reasoning in Ophthalmic Diagnosis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 420
PhysInOne: Visual Physics Learning and Reasoning in One Suite
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 421
AviaSafe: A Physics-Informed Data-Driven Model for Aviation Safety–Critical Cloud Forecasts
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 422
TTRV: Test-Time Reinforcement Learning for Vision Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 423
Reading or Reasoning? Format Decoupled Reinforcement Learning for Document OCR
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 424
QUANTIPHY: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 425
VisRes Bench: On Evaluating the Visual Reasoning Capabilities of VLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 426
TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 427
Urban-GS: A Unified 3D Gaussian Splatting Framework for Compact and High-Fidelity Aerial-to-Street Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 428
Generalizable Sparse-View 3D Reconstruction from Unconstrained Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 429
RemedyGS: Defend 3D Gaussian Splatting Against Computation Cost Attacks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 430
SparseCam4D: Spatio-Temporally Consistent 4D Reconstruction from Sparse Cameras
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 431
IDESplat: Iterative Depth Probability Estimation for Generalizable 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 432
GS^2: Graph-based Spatial Distribution Optimization for Compact 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 433
OnlinePG: Online Open-Vocabulary Panoptic Mapping with 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 434
Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 435
Learning Explicit Continuous Motion Representation for Dynamic Gaussian Splatting from Monocular Videos
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 436
MLLMSplat: A 2D MLLM-Powered Framework for 3D Gaussian Splatting Understanding, Generation, and Editing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 437
Dropping Anchor and Spherical Harmonics for Sparse-view Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 438
RAP: Fast Feedforward Rendering-Free Attribute-Guided Primitive Importance Score Prediction for Efficient 3D Gaussian Splatting Processing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 439
Plug-and-Play PDE Optimization for 3D Gaussian Splatting: Toward High-Quality Rendering and Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 440
PointGS: Semantic-Consistent Unsupervised 3D Point Cloud Segmentation with 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 442
Flow4DGS-SLAM: Optical Flow-Guided 4D Gaussian Splatting SLAM
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 443
Revisiting 3D Reconstruction Kernels as Low-Pass Filters
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 444
SR3R: Rethinking Super-Resolution 3D Reconstruction With Feed-Forward Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 445
GP-4DGS: Probabilistic 4D Gaussian Splatting from Monocular Video via Variational Gaussian Processes
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 446
VisRef: Visual Refocusing while Thinking Improves Test-Time Scaling in Multi-Modal Large Reasoning Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 447
IPR-1: Interactive Physical Reasoner
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 448
VIRO: Robust and Efficient Neuro-Symbolic Reasoning with Verification for Referring Expression Comprehension
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 450
Thinking in Dynamics: How Multimodal Large Language Models Perceive, Track, and Reason Dynamics in Physical 4D World
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 451
Latent Implicit Visual Reasoning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 452
Thinking with Programming Vision: Towards a Unified View for Thinking with Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 453
AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 454
All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 455
See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 456
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 457
ReaGEN: Adaptive Generation of Structured Chains-of-Thought for Efficient Multimodal Reasoning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 458
Breaking the Regional Perception Bottleneck of Multimodal Large Language Models via External Reasoning Framework
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 459
CodePercept: Code-Grounded Visual STEM Perception for MLLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 460
TableMix: Enhancing Multimodal Table Reasoning in MLLMs from a Data-Centric Perspective
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 461
Harnessing Chain-of-Thought Reasoning in Multimodal Large Language Models for Face Anti-Spoofing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 462
Grounded Chain-of-Thought for Multimodal Large Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 464
SegMo: Co-Designing Content-Aware Sparsity and Locally-Cohesive Segment Parallelism for Efficient VLM Inference
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 466
Compressed-Domain-Aware Online Video Super-Resolution
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 468
Is Bin Generation Indispensable? A Bin-Generation-Free Dataset Quantization via Semantic Perspective
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 469
High Resolution Neural Video Coding with Bi-directional Confidence-Guided Reference Information Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 470
Distributed Image Compression with Multimodal Side Information at Extremely Low Bitrates
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 471
Task-Aware Image Signal Processor for Advanced Visual Perception
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 472
Enhancing Video Vision Language Model with Hippocampal Sensing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 473
VIRD: View-Invariant Representation through Dual-Axis Transformation for Cross-View Pose Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 475
SoPE: Spherical Coordinate-Based Positional Embedding for Enhancing Spatial Perception of 3D LVLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 476
RHO: Robust Holistic OSM-Based Metric Cross-View Geo-Localization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 477
EfficientVPR: Toward Efficient Visual Place Recognition via Scene-Aware Prompt Tuning and Adaptive Feature Enhancement
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 478
Universal Guideline-Driven Image Clustering via a Hybrid LLM Agent
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 480
VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 481
Think, Then Verify: A Hypothesis–Verification Multi-Agent Framework for Long Video Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 482
Reinforce to Learn, Elect to Reason: A Dual Paradigm for Video Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 483
Graph-to-Frame RAG: Visual-Space Knowledge Fusion for Training-Free and Auditable Video Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 485
Multi-Modal Image Fusion via Intervention-Stable Feature Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 486
ReCoFuse: Ultra-Robust Image Fusion via Restorative Multi-Modal Diffusion Reciprocal Coupling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 487
Degradation-Robust Fusion: An Efficient Degradation-Aware Diffusion Framework for Multimodal Image Fusion in Arbitrary Degradation Scenarios
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 488
DF^2-VB: Dual-level Fuzzy Fusion with View-specific Boosting for Multi-view Multi-label Classification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 489
UniFusion: A Unified Image Fusion Framework with Robust Representation and Source-Aware Preservation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 490
Self-guided Semantic Inspection for Zero-Shot Composed Image Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 491
G-MIXER: Geodesic Mixup-based Implicit Semantic Expansion and Explicit Semantic Re-ranking for Zero-Shot Composed Image Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 492
No Hard Negatives Required: Concept Centric Learning Leads to Compositionality without Degrading Zero-shot Capabilities of Contrastive Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 493
MUSE: Harnessing Precise and Diverse Semantics for Few-Shot Whole Slide Image Classification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 494
Pointing at Parts: Training-Free Few-Shot Grounding in Multimodal LLMs
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 495
Graph Attention Prototypical Network for Robust Few-Shot Classification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 496
Mitigating The Distribution Shift of Diffusion-based Dataset Distillation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 497
EVLF: Early Vision-Language Fusion for Generative Dataset Distillation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 498
Fixed Anchors Are Not Enough: Dynamic Retrieval and Persistent Homology for Dataset Distillation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 499
Flow Map Distillation Without Data
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 500
F^2HDR: Two-Stage HDR Video Reconstruction via Flow Adapter and Physical Motion Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 501
Learning Latent Transmission and Glare Maps for Lens Veiling Glare Removal
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 502
Inter-Photon-Limited Videography
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 503
A Bit is All You Need! Efficient Video Capture via Single Bit Imaging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 504
From Events to Clarity: The Event-Guided Diffusion Framework for Dehazing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 505
Electromagnetic Inverse Scattering from a Single Transmitter
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 506
Statistical Characteristic-Guided Denoising for Rapid High-Resolution Transmission Electron Microscopy Imaging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 507
Physics-Guided Multistep Deformation Reversal for Ancient Bamboo Slip Restoration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 508
cryoSENSE: Compressive Sensing Enables High-throughput Microscopy with Sparse and Generative Priors on the Protein Cryo-EM Image Manifold
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 509
SGDE: Self-supervised Geometry Degradation Estimation Framework for Coded Aperture Compressive Spectral Imaging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 510
Factorized Context Aggregation for Robust Cancer Risk Estimation via Soft Re-Ranked Retrieval and Hierarchical Anchors
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 511
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 513
Depth Any Endoscopy: Towards Self-Supervised Generalizable Depth Estimation in Monocular Endoscopy
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 514
RoSAMDepth: Robust Self-supervised Depth Estimation Leveraging Segment Anything Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 515
AdaSFormer: Adaptive Serialized Transformers for Monocular Semantic Scene Completion from Indoor Environments
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 516
Dark3R: Learning Structure from Motion in the Dark
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 517
What Makes Good Synthetic Training Data for Zero-Shot Stereo Matching?
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 518
TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Dual-Level Scale-Oriented Contrast
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 519
Iris: Integrating Language into Diffusion-based Monocular Depth Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 520
Ov3R: Open-Vocabulary Semantic 3D Reconstruction from RGB Videos
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 521
M3DLayout: A Multi-Source Dataset of 3D Indoor Layouts and Structured Descriptions for 3D Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 522
UniPart: Part-Level 3D Generation with Unified 3D Geom–Seg Latents
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 523
Photo3D: Advancing Photorealistic 3D Generation through Structure‑Aligned Detail Enhancement
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 524
Mesh-Pro: Asynchronous Advantage-guided Ranking Preference Optimization for Artist-style Quadrilateral Mesh Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 525
Order Matters: 3D Shape Generation from Sequential VR Sketches
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 526
Think-Then-Generate: Structural Chain-of-Thought Reasoning for Consistent 3D Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 527
ArtLLM: Generating Articulated Assets via 3D LLM
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 528
PoseMaster: A Unified 3D Native Framework for Stylized Pose Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 529
2D-LFM: Lifting Foundation Model without 3D Supervision
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 530
ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 531
4DWorldBench: A Comprehensive Evaluation Framework for 3D/4D World Generation Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 532
FabricGen: Microstructure-Aware Woven Fabric Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 533
Leveraging Verifier-Based Reinforcement Learning in Image Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 534
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 535
VIVA: VLM-Guided Instruction-Based Video Editing with Reward Optimization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 536
MapReduce LoRA: Advancing the Pareto Front in Multi-Preference Optimization for Generative Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 537
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 538
C^2FG: Control Classifier-Free Guidance via Score Discrepancy Analysis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 539
Learning What to Trust: Bayesian Prior-Guided Optimization for Visual Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 540
Unified Customized Generation by Disentangled Reward Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 541
Region-Aware Instance Consistency Learning for Micro-Expression Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 542
MPL: Match-guided Prototype Learning for Few-shot Action Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 543
LaDy: Lagrangian-Dynamic Informed Network for Skeleton-based Action Segmentation via Spatial-Temporal Modulation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 544
LA-Pose: Latent Action Pretraining Meets Pose Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 545
RAAS: LLM Agentic System Architecture Search with GRPO
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 546
Temporal Representation Enhancement (TRE): Learning to Forget Dominant Patterns for Enhanced Temporal Spiking Features
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 547
Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 548
Unlocking Pre-trained Weights: Parameter Inheritance for Zero-Shot Initialization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 549
Deconstructing the Failure of Ideal Noise Correction: A Three-Pillar Diagnosis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 550
Progressive Neural Architecture Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 551
A Unified Framework for Knowledge Transfer in Bidirectional Model Scaling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 552
When Do Models Actually Decide? Mapping the Layer-Wise Decision Timeline in Pretrained Neural Networks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 553
Temporal Interaction in Spiking Transformers with Multi-Delay Mixer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 554
Consensus vs. Controversy: Mapping the Decision Space Where Architectures Diverge
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 555
Sparsely Timing the Change: A Spiking Temporal Framework for Remote Sensing Interpretation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 556
ProSoftArena: Benchmarking Hierarchical Capabilities of Multi-modal Agents in Professional Software Environments
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 557
BAMI: Training-Free Bias Mitigation in GUI Grounding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 558
DRS-GUI: Dynamic Region Search for Training-Free GUI Grounding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 559
Consistency Beyond Contrast: Enhancing Open-Vocabulary Object Detection Robustness via Contextual Consistency Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 560
Thermal-Det: Language-Guided Cross-Modal Distillation for Open-Vocabulary Thermal Object Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 561
Geometry-driven OOD Detectors Are Class-Incremental Learners
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 562
Mind the Way You Select Negative Texts: Pursuing the Distance Consistency in OOD Detection with VLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 563
Prompt-Free Unknown Label Generation for Open World Detection in Remote Sensing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 564
Learning to Diversify and Focus: A Reinforcement Framework for Open-Vocabulary HOI Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 565
RINO: Rotation-Invariant Non-Rigid Correspondences
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 566
Hyperbolic Prototype Learning with Uncertainty-Aware Consistency for Continual Test-Time Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 567
DINO Eats CLIP: Adapting Beyond Knowns for Open-set 3D Object Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 568
Leveraging Class Distributions in CLIP for Weakly Supervised Semantic Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 569
CompetitorFormer: Mitigating Query Conflicts for 3D Instance Segmentation via Competitive Strategy
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 570
D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 571
Discover, Segment, and Select: A Progressive Mechanism for Zero-shot Camouflaged Object Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 572
D-Convexity: A Unified Differentiable Convex Shape Prior via Quasi-Concavity for Data-driven Image Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 573
Fast Reasoning Segmentation for Images and Videos
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 574
Structure-Aware Representation Distillation for Tiny-Dense Object Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 575
CRFT: Consistent–Recurrent Feature Flow Transformer for Cross-Modal Image Registration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 576
FireScope: Wildfire Risk Raster Prediction With a Chain-of-Thought Oracle
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 577
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 578
TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 579
Regulating Rather than Constraining: Adaptive Guidance for Complex Spectral Reconstruction in Pansharpening
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 580
GeoMMBench and GeoMMAgent: Toward Expert-Level Multimodal Intelligence in Geoscience and Remote Sensing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 581
Revisiting the Necessity of Full Accuracy: Weakly Supervised Object-Level Offset Correction for Misaligned Building Labels
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 582
UniGeoSeg: Towards Unified Open-World Segmentation for Geospatial Scenes
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 583
ZoomEarth: Active Perception for Ultra-High-Resolution Geospatial Vision-Language Tasks
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 584
Unleashing Stealthy Backdoor Pandemic by Infecting a Single Diffusion Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 585
Taming the Long Tail: Rebalancing Adversarial Training via Adaptive Perturbation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 586
Robustness Under Data Scarcity: Few-Shot Continual Adversarial Training for Evolving Threats
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 587
Logit-Margin Repulsion for Backdoor Defense
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 588
Thermally Activated Dual-Modal Adversarial Clothing against AI Surveillance Systems
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 589
Immunizing Models Against Harmful Long-Horizon Fine-Tuning via Contractive Optimization Dynamics
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 590
Towards Stealthy and Effective Backdoor Attacks on Lane Detection: A Naturalistic Data Poisoning Approach
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 591
Red-teaming Retrieval-Augmented Diffusion Models via Poisoning Knowledge Bases
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 592
Latent Diffusion Inversion Requires Understanding the Latent Space
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 593
Fractal Camouflage: A Bio-Inspired Approach for Multi-Scale Adversarial Attacks in the Infrared Domain
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 594
EgoRoC: Towards Egocentric Robotic Control via Task-Agnostic Visual Alignment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 595
Describe Anything Anywhere At Any Moment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 596
StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 597
VLA Models Are More Generalizable Than You Think: Revisiting Physical and Spatial Modeling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 598
Action–Geometry Prediction with 3D Geometric Prior for Bimanual Manipulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 599
Joint-Aligned Latent Action: Towards Scalable VLA Pretraining in the Wild
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 600
Rethinking Camera Choice: An Empirical Study on Fisheye Camera Properties in Robotic Manipulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 601
INSIGHT Bench: Towards Grounded IN-SItu Guidance for Robotic ManipulaTion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 602
MM-ACT: Learn from Multimodal Parallel Generation to Act
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 603
HQC-NBV: A Hybrid Quantum-Classical View Planning Approach
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 604
Motus: A Unified Latent Action World Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 605
SE(3)-Equivariance with Geometric and Topological Guidance for Category-Level Object Pose Estimation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 606
SPEAR-1: Scaling Beyond Robot Demonstrations via 3D Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 607
Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 608
RoboTAG: End-to-end Robot Pose Estimation via Topological Alignment Graph
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 609
MVLM: Template-Free Tracking via Vision–Language Margin Confidence and Memory-Gated Tracking
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 610
Interactive Tracking: A Human-in-the-Loop Paradigm with Memory-Augmented Adaptation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 611
VidEoMT: Your ViT is Secretly Also a Video Segmentation Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 612
Matching Every Pair to Track Every Point: PairFormer for All-Pairs Tracking and Video Trajectory Fields
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 613
Boosting Self-Supervised Tracking with Contextual Prompts and Noise Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 614
Progressive Multi-cue Alignment for Unaligned RGBT Tracking
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 615
Real-Time Neural Video Compression with Unified Intra and Inter Coding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 616
Adapting Lightweight Image-based Counting Models for Video Crowd Counting
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 618
MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 619
MedKCO: Medical Vision-Language Pretraining via Knowledge-Driven Cognitive Orchestration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 620
Toward Generalizable Whole Brain Representations with High-Resolution Light-Sheet Data
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 621
CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 622
GenTract: Generative Global Tractography
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 623
LUMINA: A Multi-Vendor Mammography Benchmark with Energy Harmonization Protocol
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 624
Virtual Immunohistochemistry Staining with Dual-Aligned Multi-Task Feature Guidance
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 625
Can Natural Image Autoencoders Compactly Tokenize fMRI Volumes for Long-Range Dynamics Modeling?
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 626
IEBGL: An Interpretability-Enhanced Brain Graph Learning Framework with LLM-Instructed Topology and Literature-Augmented Semantics
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 627
F^2-Assist: Multi-Phase Fetal Growth Forecast and Report Generation from Ultrasound Examination
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 628
Sparse Spectral LoRA: Routed Experts for Medical VLMs
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 629
SAT-RRG: LLM-Guided Self-Adaptive Training for Radiology Report Generation with Token-Level Push–Pull Optimization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 630
OralGPT-Plus: Learning to Use Visual Tools via Reinforcement Learning for Panoramic X-ray Analysis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 632
Forensic-Friendly Image Manipulation via Controllable Latent Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 633
IncreFA: Breaking the Static Wall of Generative Model Attribution
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 634
AVFakeBench: A Comprehensive Audio-Video Forgery Detection Benchmark for AV-LMMs
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 635
Detecting Compressed AI-Generated Images via Phase Spectrum Robustness
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 636
Detect Any AI-Counterfeited Text Image
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 637
DeepfakeImpact: A Two-Stage Benchmark with Real-World Impact in Deepfake Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 638
Enhancing the Security of Visual Speaker Authentication Based on Dynamic Lip-Print Analysis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 639
SimLBR: Learning to Detect Fake Images by Learning to Detect Real Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 640
Editprint: General Digital Image Forensics via Editing Fingerprint with Self-Augmentation Training
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 641
Detecting AI-Generated Forgeries via Iterative Manifold Deviation Amplification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 642
Goldilocks Test Sets for Face Verification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 643
Fine-VAD: Towards Fine-Grained Video Anomaly Detection via Progressive Cross-Granularity Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 644
DLVP-CLIP: Enhancing Fine-Grained Zero-Shot Anomaly Detection via Dynamic Local Visual Prompting
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 645
MoECLIP: Patch-Specialized Experts for Zero-shot Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 646
Alert-CLIP: Abnormality-aware Latent-Enhanced Representation Tuning of CLIP for Video Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 647
AnomalyVFM -- Transforming Vision Foundation Models into Zero-Shot Anomaly Detectors
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 649
Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 650
GS-CLIP: Zero-shot 3D Anomaly Detection by Geometry-Aware Prompt and Synergistic View Representation Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 651
TLMA: Mitigating the Impact of Weakly Labeled Information for Video Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 652
Defect Cue-Preserved Structural Feature Refinement for Few-Shot Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 653
Anomaly-Related Residual Fields for Cross-domain Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 654
From Attraction to Equilibrium: Physics-Inspired Semantic Gravitons for Zero-Shot Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 655
Joint Learning of General and Diverse Patterns with Mixture of Memory Experts for Weakly-Supervised Video Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 656
No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 657
FB-CLIP: Fine-Grained Zero-Shot Anomaly Detection with Foreground-Background Disentanglement
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 658
DynamicVGGT: Learning Dynamic Point Maps for 4D Scene Reconstruction in Autonomous Driving
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 659
GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 660
Test-Time 3D Occupancy Prediction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 661
Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 663
dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 664
RegionRoute: Regional Style Transfer with Diffusion Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 665
Low-Rank Residual Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 666
RDF-MIG: A Robust Diffusion Framework for Masked Image Generation to Augment Semantic Segmentation and Change Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 667
TC-Padé: Trajectory-Consistent Padé Approximation for Diffusion Acceleration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 668
Bi-directional Autoregressive Diffusion for Large Complex Motion Interpolation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 669
Guiding Token-Sparse Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 670
Accelerating Diffusion-based Video Editing via Heterogeneous Caching: Beyond Full Computing at Sampled Denoising Timestep
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 672
High-Fidelity Virtual Try-On beyond Paired Data Scarcity via Diffusion-based Cycle-Consistent Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 673
Sampling-Aware Quantization for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 674
CRAFT: Aligning Diffusion Models with Fine-Tuning Is Easier Than You Think
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 675
Scale Space Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 676
Making Training-Free Diffusion Segmentors Scale with the Generative Power
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 677
Roots Beneath the Cut: Uncovering the Risk of Concept Recovery in Pruning-Based Unlearning for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 678
Few-Step Diffusion Sampling Through Instance-Aware Discretizations
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 679
SpeeDiff: Scalable Pixel-Anchored End-to-End Latent Diffusion Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 680
Structure-to-Intensity Diffusion for Adverse-Weather LiDAR Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 681
Focal–General Diffusion Model with Semantic Consistent Guidance for Sign Language Production
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 682
Diffusion Probe: Generated Image Result Prediction Using CNN Probes
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 683
Content-Aware Dynamic Patchification for Efficient Video Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 684
PixelRush: Ultra-Fast, Training-Free High-Resolution Image Generation via One-step Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 685
Diffusion-Based sRGB Real Noise Generation via Prompt-Driven Noise Representation Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 686
Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 687
GROW: Watermark Generation with Progressive Guidance for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 688
MotionV2V: Editing Motion in a Video
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 689
Mind the Generative Details: Direct Localized Detail Preference Optimization for Video Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 690
OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 691
DreamStyle: A Unified Framework for Video Stylization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 692
Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 693
Designing Instance-Level Sampling Schedules via REINFORCE with James-Stein Shrinkage
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 694
Reward Sharpness-Aware Fine-Tuning for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 695
DBMSolver: A Training-free Diffusion Bridge Sampler for High-Quality Image-to-Image Translation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 696
Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 697
TAP: A Token-Adaptive Predictor Framework for Training-Free Diffusion Acceleration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 698
Cross-modal Representation Learning for Diffusion-generated Image Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 699
Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 700
Back to Basics: Let Denoising Generative Models Denoise
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 701
CaricHarmony: Contrastive Diffusion Paths for Identity-Preserving Caricature Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 702
DiP: Taming Diffusion Models in Pixel Space
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 703
RAPID: Reusing Attention Sparsity with Inter-step Adaptation for Efficient Video Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 704
Efficient and Training-Free Single-Image Diffusion Models
Poster Session
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F None
Poster Session 5 & Exhibit Hall
Art Program
Sun Jun 07 10:45 AM -- 02:00 PM (PDT) @ ExHall F None
Art Exhibition
Art Program
Sun Jun 07 10:45 AM -- 11:15 AM (PDT) @ ExHall F None
Art Gallery Tour with Curator and Artists
Oral
Sun Jun 07 01:00 PM -- 01:15 PM (PDT) @ Mile High Ballroom 3A - 4A None
Efficient Unrolled Networks for Large-Scale 3D Inverse Problems
Oral
Sun Jun 07 01:00 PM -- 01:12 PM (PDT) @ Mile High Ballroom 1A - 2A None
CURE: Curriculum-guided Multi-task Training for Reliable Anatomy Grounded Report Generation
Oral
Sun Jun 07 01:00 PM -- 01:15 PM (PDT) @ Bluebird Ballroom None
Differentiable Laplacian Matrix Guided Superpixel Segmentation
Oral
Sun Jun 07 01:00 PM -- 01:15 PM (PDT) @ Four Seasons Ballroom None
CineBrain: A Large-Scale Multi-Modal Audiovisual Brain Dataset for Brain-Conditioned Video Generation
Oral Session
Sun Jun 07 01:00 PM -- 02:15 PM (PDT) @ Bluebird Ballroom None
Oral Session 6A: Geometric Learning
Oral Session
Sun Jun 07 01:00 PM -- 02:15 PM (PDT) @ Mile High Ballroom 1A - 2A None
Oral Session 6C: Medical Vision
Oral Session
Sun Jun 07 01:00 PM -- 02:15 PM (PDT) @ Four Seasons Ballroom None
Oral Session 6B: Multimodal Reasoning
Oral Session
Sun Jun 07 01:00 PM -- 02:15 PM (PDT) @ Mile High Ballroom 3A - 4A None
Oral Session 6D: Large-Scale Neural Modeling
Oral
Sun Jun 07 01:12 PM -- 01:25 PM (PDT) @ Mile High Ballroom 1A - 2A None
DK-DDIL: Adaptive Knowledge Retention for Dynamic Domain-Incremental Learning in Medical Imaging
Oral
Sun Jun 07 01:15 PM -- 01:30 PM (PDT) @ Bluebird Ballroom None
FILTR: Extracting Topological Features from Pretrained 3D Models
Oral
Sun Jun 07 01:15 PM -- 01:30 PM (PDT) @ Mile High Ballroom 3A - 4A None
FedAdamom: Adaptive Momentum for Improved Generalization in Federated Optimization
Oral
Sun Jun 07 01:15 PM -- 01:30 PM (PDT) @ Four Seasons Ballroom None
Hearing the Room Through the Shape of the Drum: Modal-Guided Sound Recovery from Multi-Point Surface Vibrations
Oral
Sun Jun 07 01:25 PM -- 01:37 PM (PDT) @ Mile High Ballroom 1A - 2A None
Dual-level Adapter Boosting Prompt-free Curvilinear Structure Segmentation
Oral
Sun Jun 07 01:30 PM -- 01:45 PM (PDT) @ Bluebird Ballroom None
Learning Convex Decomposition via Feature Fields
Oral
Sun Jun 07 01:30 PM -- 01:45 PM (PDT) @ Four Seasons Ballroom None
SDTrack: A Baseline for Event-based Tracking via Spiking Neural Networks
Oral
Sun Jun 07 01:30 PM -- 01:45 PM (PDT) @ Mile High Ballroom 3A - 4A None
SimScale: Learning to Drive via Real-World Simulation at Scale
Oral
Sun Jun 07 01:37 PM -- 01:50 PM (PDT) @ Mile High Ballroom 1A - 2A None
LATA: Laplacian-Assisted Transductive Adaptation for Conformal Uncertainty in Medical VLMs
Oral
Sun Jun 07 01:45 PM -- 02:00 PM (PDT) @ Mile High Ballroom 3A - 4A None
Texvent: Asynchronous Event Data Simulation via Text Prompt
Oral
Sun Jun 07 01:45 PM -- 02:00 PM (PDT) @ Bluebird Ballroom None
Learning Eigenstructures of Unstructured Data Manifolds
Oral
Sun Jun 07 01:45 PM -- 02:00 PM (PDT) @ Four Seasons Ballroom None
Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding
Oral
Sun Jun 07 01:50 PM -- 02:02 PM (PDT) @ Mile High Ballroom 1A - 2A None
Medic-AD: Towards Medical Vision-Language Model's Clinical Intelligence
Poster Setup
Sun Jun 07 02:00 PM -- 02:30 PM (PDT) @ ExHall A None
Poster Setup
Oral
Sun Jun 07 02:00 PM -- 02:15 PM (PDT) @ Four Seasons Ballroom None
Wan-Weaver: Interleaved Multi-modal Generation via Decoupled Training
Oral
Sun Jun 07 02:00 PM -- 02:15 PM (PDT) @ Mile High Ballroom 3A - 4A None
WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
Oral
Sun Jun 07 02:00 PM -- 02:15 PM (PDT) @ Bluebird Ballroom None
Mapping Networks
Oral
Sun Jun 07 02:02 PM -- 02:15 PM (PDT) @ Mile High Ballroom 1A - 2A None
SegMoTE: Token-Level Mixture of Experts for Medical Image Segmentation
Break
Sun Jun 07 02:15 PM -- 02:30 PM (PDT) None
Courtesy Break
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 1
Differentiable Laplacian Matrix Guided Superpixel Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 2
FILTR: Extracting Topological Features from Pretrained 3D Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 3
Learning Convex Decomposition via Feature Fields
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 4
Learning Eigenstructures of Unstructured Data Manifolds
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 6
CineBrain: A Large-Scale Multi-Modal Audiovisual Brain Dataset for Brain-Conditioned Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 7
Hearing the Room Through the Shape of the Drum: Modal-Guided Sound Recovery from Multi-Point Surface Vibrations
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 8
SDTrack: A Baseline for Event-based Tracking via Spiking Neural Networks
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 9
Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 10
Wan-Weaver: Interleaved Multi-modal Generation via Decoupled Training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 11
CURE: Curriculum-guided Multi-task Training for Reliable Anatomy Grounded Report Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 12
DK-DDIL: Adaptive Knowledge Retention for Dynamic Domain-Incremental Learning in Medical Imaging
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 13
Dual-level Adapter Boosting Prompt-free Curvilinear Structure Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 14
LATA: Laplacian-Assisted Transductive Adaptation for Conformal Uncertainty in Medical VLMs
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 15
Medic-AD: Towards Medical Vision-Language Model's Clinical Intelligence
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 16
SegMoTE: Token-Level Mixture of Experts for Medical Image Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 17
Efficient Unrolled Networks for Large-Scale 3D Inverse Problems
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 18
FedAdamom: Adaptive Momentum for Improved Generalization in Federated Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 19
SimScale: Learning to Drive via Real-World Simulation at Scale
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 20
Texvent: Asynchronous Event Data Simulation via Text Prompt
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 21
WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 22
BuildingGPT: Auto-Regressive Building Wireframe Reconstruction Model with Reinforcement Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 23
Emergent Extreme-View Geometry in 3D Foundation Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 24
LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 25
LASER: Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 26
PanoVGGT: Feed-Forward 3D Reconstruction from Panoramic Imagery
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 27
Rascene: High-Fidelity 3D Scene Imaging with mmWave Communication Signals
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 28
VGG-T^3: Offline Feed-Forward 3D Reconstruction at Scale
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 30
OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 32
HeSS: Head Sensitivity Score for Sparsity Redistribution in VGGT
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 33
Dense Metric Depth Completion from Sparse Direct Time-of-Flight Sensors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 34
Online3R: Online Learning for Consistent Sequential Reconstruction Based on Geometry Foundation Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 35
Neu-PiG: Neural Preconditioned Grids for Fast Dynamic Surface Reconstruction on Long Sequences
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 36
Learning 3D Reconstruction with Priors in Test Time
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 37
ArchSym: Detecting 3D-Grounded Architectural Symmetries in the Wild
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 38
PointTPA: Dynamic Network Parameter Adaptation for 3D Scene Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 39
tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 40
Hint2Gen: Bridging Understanding and Generation via Code-structured Hints
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 41
Compositional Text-to-Image Generation Via Region-aware Bimodal Direct Preference Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 42
Learning by Analogy: A Causal Framework for Compositional Generalization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 43
ID-Crafter: VLM-Grounded Online RL for Compositional Multi-Subject Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 44
GenColorBench: A Color Evaluation Benchmark for Text-to-Image Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 45
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 46
When Pretty Isn’t Useful: Investigating Why Modern Text-to-Image Models Fail as Reliable Training Data Generators
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 47
TempoControl: Temporal Attention Guidance for Text-to-Video Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 48
Hear What Matters! Text-conditioned Selective Video-to-Audio Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 49
MultiCrafter: High-Fidelity Multi-Subject Generation via Disentangled Attention and Identity-Aware Preference Alignment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 51
DiffGraph: An Automated Agent-driven Model Merging Framework for In-the-Wild Text-to-Image Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 52
Gloria: Consistent Character Video Generation via Content Anchors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 53
DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 54
M4V: Multimodal Mamba for Efficient Text-to-Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 55
Property-Informed Diffusion-Based Text-to-Microstructure Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 56
DreamingComics: A Story Visualization Pipeline via Subject and Layout Customized Generation using Video Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 57
Mixture of States: Routing Token-Level Dynamics for Multimodal Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 58
HiCoGen: Hierarchical Compositional Text-to-Image Generation in Diffusion Models via Reinforcement Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 59
TherA: Thermal-Aware Visual-Language Prompting for Controllable RGB-to-Thermal Infrared Translation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 60
See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 61
CoV-Align: Efficient Fine-grained Cross-Modal Alignment with Cohesive Visual Semantics Priority
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 62
TDATR: Improving End-to-End Table Recognition via Table Detail-Aware Learning and Cell-Level Visual Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 63
A Mixed Diet Makes DINO An Omnivorous Vision Encoder
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 64
Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 65
TaskForce: Cooperative Multi-agent Reinforcement Learning for Multi-task Optimization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 66
PhyCritic: Multimodal Critic Models for Physical AI
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 67
R-C2: Cycle-Consistent Reinforcement Learning Improves Multimodal Reasoning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 68
Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 69
Unified Generation and Self-Verification for Vision-Language Models via Advantage Decoupled Preference Optimization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 70
Anchoring the Mind of Multimodal Reasoners: Cognitive Bias as a Vector for Jailbreak Attacks
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 71
InsCal: Calibrated Multi-Source Fully Test-Time Prompt Tuning for Object Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 72
Why Not Hyperparameter-Friendly Optimisation? A Monotonic Adaptive Norm Rescaling Approach For Long-Tailed Recognition
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 73
Decoupling Vision and Language: Codebook Anchored Visual Adaptation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 74
MemFlow: A Lightweight Forward Memorizing Framework for Quick Domain Adaptive Feature Mapping
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 75
Mind the Discriminability Trap in Source-Free Cross-domain Few-shot Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 76
Vision-Language Model Guided Source-Free Domain Adaptation via Optimal Transport
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 77
Masked Representation Modeling for Domain-Adaptive Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 78
TaskIT: Memory-Efficient Fine-Tuning of Multi-LoRA LLMs via Cross-Task Importance Transfer
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 79
ARES: Unifying Asymmetric RGB-Event Stereo for Probabilistic Scene Flow Estimation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 80
MER-Tracker: Towards High-Speed 3D Point Tracking via Multi-View Event-RGB Hybrid Cameras
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 81
Moving Border Ownership for Event-based Motion Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 82
TTAPFormer: Robust Arbitrary Point Tracking via Transient Asynchronous Fusion of Frames and Events
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 83
EventHub: Data Factory for Generalizable Event-Based Stereo Networks without Active Sensors
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 84
Seeing Motion Through Polarity for Event-based Action Recognition
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 85
Multi-Scale Gaussian-Language Map for Zero-shot Embodied Navigation and Reasoning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 86
Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 87
SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 88
TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 89
AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 90
Experience Transfer for Multimodal LLM Agents in Minecraft Game
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 91
MSGNav: Unleashing the Power of Multi-modal 3D Scene Graph for Zero-Shot Embodied Navigation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 92
SaPaVe: Towards Active Perception and Manipulation in Vision-Language Action Models for Robotics
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 93
MANSION: Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 94
RealAppliance: Let High-fidelity Appliance Assets Controllable and Workable as Aligned Real Manuals
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 95
ForeAct: Steering Your VLA with Efficient Visual Foresight Planning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 96
Affordance Field Intervention: Enabling VLAs to Escape Memory Traps in Robotic Manipulation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 97
MERIT: Multi-domain Efficient RAW Image Translation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 98
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 99
Probabilistic Prompt Adaptation for Unified Image Aesthetics and Quality Assessment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 100
EMMA: Concept Erasure Benchmark with Comprehensive Semantic Metrics and Diverse Categories
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 101
Too Vivid to Be Real? Benchmarking and Calibrating Generative Color Fidelity
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 102
WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 103
UnicEdit-10M: A Dataset and Benchmark Breaking the Scale-Quality Barrier via Unified Verification for Reasoning-Enriched Edits
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 104
Inter-Edit: First Benchmark for Interactive Instruction-Based Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 105
PR-IQA: Partial-Reference Image Quality Assessment for Diffusion-Based Novel View Synthesis
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 106
LumiMotion: Improving Gaussian Relighting with Scene Dynamics
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 107
Let it Snow! Animating 3D Gaussian Scenes with Dynamic Weather Effects via Physics-Guided Score Distillation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 108
iLRM: An Iterative Large 3D Reconstruction Model
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 109
MVInverse: Feed-forward Multiview Inverse Rendering in Seconds
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 110
From None to All: Self-Supervised 3D Reconstruction via Novel View Synthesis
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 111
MoRel: Long-Range Flicker-Free 4D Motion Modeling via Anchor Relay-based Bidirectional Blending with Hierarchical Densification
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 112
Multi-view Pyramid Transformer: Look Coarser to See Broader
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 113
CaT-GS: Efficient 3DGS Rendering for Large Scale Scenes via Inter-frame Caching and Tile Scheduling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 114
RL‑ScanIQA: Reinforcement-Learned Scanpaths for Blind 360° Image Quality Assessment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 115
Benchmarking Endoscopic Surgical Image Restoration and Beyond
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 116
SDUIE: Semi-Supervised Diffusion for Underwater Image Enhancement with Quant-Text Dual Control
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 117
HiDRA: Hierarchical Degradation Representation and Adaptation with Generative Priors for Enhancing Infrared Vision
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 118
BluRef: Unsupervised Image Deblurring with Dense-Matching References
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 119
Bi-Bridge: Bidirectional Diffusion Bridges for Low-Light Image Enhancement
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 120
UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 121
MatAnyone 2: Scaling Video Matting via a Learned Quality Evaluator
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 122
SelfHVD: Self-Supervised Handheld Video Deblurring
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 123
Spatio-Temporal Difference Guided Motion Deblurring with the Complementary Vision Sensor
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 124
Learning Where to Look and How to Judge: Resolution-agnostic Image Quality Assessment with Quality-aware Saliency
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 125
Bridging RGB and Hematoxylin Components: An Interleaved Guidance and Fusion Framework for Point Supervised Nuclei Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 126
Virtual Nodes Guided Dynamic Graph Neural Network for Brain Tumor Segmentation with Missing Modalities
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 127
VoxTell: Free-Text Promptable Universal 3D Medical Image Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 128
Photo-Guided Tooth Segmentation on 3D Oral Scan Model
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 129
Breaking the Continuum: Discrete Distribution Learning for Structural MRI Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 130
Uni-Hema: Unified Model for Digital Hematopathology
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 131
Post-training Feature Pruning for Fundus Images Classification
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 132
Sketch2CT: Multimodal Diffusion for Structure-Aware 3D Medical Volume Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 133
SafeLogo: Turning Your Logos into Jailbreak Shields via Micro-Regional Adversarial Training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 134
Anti-I2V: Safeguarding your Photos from Malicious Image-to-video Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 135
UniGame: Turning a Unified Multimodal Model Into Its Own Adversary
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 136
Hierarchically Robust Zero-shot Vision-language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 137
Beyond Text Prompts: Precise Concept Erasure through Text–Image Collaboration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 138
AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 139
ReMoE: Region-Mixture Experts for Adversarially-Robust Vision Transformers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 140
TreeTeaming: Autonomous Red-Teaming of Vision-Language Models via Hierarchical Strategy Exploration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 141
SO-Bench: A Structural Output Evaluation of Multimodal LLM
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 142
Chain-of-Thought Guided Multi-Modal Object Re-Identification
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 143
When Lines Meet Textures: Spatial-Frequency Aligned Diffusion Features for Cross-Sparsity Correspondence
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 144
CountGD++: Generalized Prompting for Open-World Counting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 145
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 146
Parameter-Efficient Adaptation for MLLMs via Implicit Modality Decomposition
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 147
Hyperbolic Gramian Volumes for Multimodal Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 148
Venus: Benchmarking and Empowering Multimodal Large Language Models for Aesthetic Guidance and Cropping
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 149
AutoCut: End-to-end advertisement video editing based on multimodal discretization and controllable generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 150
StableMTL: Repurposing Latent Diffusion Models for Multi-Task Learning from Partially Annotated Synthetic Datasets
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 151
CaReFlow: Cyclic Adaptive Rectified Flow for Multimodal Fusion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 152
Lenses: Toward Polysemous Vision–Language Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 153
CoRiM: Conflict-driven Risk Minimization for Dynamic Multimodal Fusion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 154
Uncertainty-Aware Exploratory Direct Preference Optimization for Multimodal Large Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 155
CICA: Coupling Confidence-Aware Pretraining with Confidence-Informed Attention for Robust Multimodal Sentiment Analysis
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 156
SAMTok: Representing Any Mask with Two Words
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 157
Multi-Metric Representation Learning Strategy Based on Clustering for Fine-Grained Multimodal Sentiment Analysis
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 159
MMSD3.0: A Multi-Image Benchmark for Real-World Multimodal Sarcasm Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 160
Anchor-Guided Gradient Alignment for Incomplete Multimodal Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 161
PyraTok: Language-Aligned Pyramidal Tokenizer for Video Understanding and Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 162
VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 163
Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 164
VideoCoF: Unified Video Editing with Temporal Reasoner
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 165
Progressive Supernet Training for Efficient Visual Autoregressive Modeling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 166
CoT-Edit: Let CoT Guide Instruction Video Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 168
Test-Time Instance-Specific Parameter Composition: A New Paradigm for Adaptive Generative Modeling
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 169
Understanding, Accelerating, and Improving MeanFlow Training
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 170
Meta-CoT: Enhancing Granularity and Generalization in Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 171
Dual-Granularity Memory for Efficient Video Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 172
Unified Camera Positional Encoding for Controlled Video Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 173
EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 175
PLACID: Identity-Preserving Multi-Object Compositing via Video Diffusion with Synthetic Trajectories
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 176
Object-WIPER: Training-Free Object and Associated Effect Removal in Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 177
Mobile-VTON: High-Fidelity On-Device Virtual Try-On
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 178
Progress by Pieces: Test-Time Scaling for Autoregressive Image Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 179
Towards Robust Sequential Decomposition for Complex Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 180
Layer Consistency Matters: Elegant Latent Transition Discrepancy for Generalizable Synthetic Image Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 181
Chain of Event-Centric Causal Thought for Physically Plausible Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 182
LoL: Longer than Longer, Scaling Video Generation to Hour
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 183
FlowMotion: Training-Free Flow Guidance for Video Motion Transfer
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 184
Learning Straight Flows: Variational Flow Matching for Efficient Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 185
SIGMA: Selective-Interleaved Generation with Multi-Attribute Tokens
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 186
DNF-SR: Dual-Input and Negative-Aware Feature Fine-Tuning for Real-World Image Super-Resolution
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 187
IFCSR: Inference-Free Fidelity-Realism Control for One-Step Diffusion-based Real-World Image Super-Resolution
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 188
Edge-Focused Super-Resolution for Omnidirectional Images with Spherical Geometric Augmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 189
TUDSR: Twice Upsampling-Diffusion for Higher Super-Resolution
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 190
PS-SR: Pseudo-Single-Step Video Super-Resolution via Speculative Diffusion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 191
Disentangled Textual Priors for Diffusion-based Image Super-Resolution
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 192
Remote Sensing Image Super-Resolution for Imbalanced Textures: A Texture-Aware Diffusion Framework
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 193
Rethinking Diffusion Model-Based Video Super-Resolution: Leveraging Dense Guidance from Aligned Features
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 194
DreamSR: Towards Ultra-High-Resolution Image Super-Resolution via a Receptive-Field Enhanced Diffusion Transformer
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 195
FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 196
STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 197
Towards Highly-Constrained Human Motion Generation with Retrieval-Guided Diffusion Noise Optimization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 198
Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 199
Human Geometry Distribution for 3D Animation Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 200
A Temporal and Content Co-Awareness Latent Diffusion for Controllable Hand Image Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 201
Superman: Unifying Skeleton and Vision for Human Motion Perception and Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 202
Learning to Assist: Physics-Grounded Human-Human Control via Multi-Agent Reinforcement Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 203
Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 204
Causal Motion Diffusion Models for Autoregressive Motion Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 205
Towards Storytelling Animations: Joint Synthesis of Human and Camera Motions
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 206
MoLingo: Motion–Language Alignment for Text-to-Human Motion Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 207
End-to-End Language-Action Model for Humanoid Whole Body Control
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 208
Toward Early Quality Assessment of Text-to-Image Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 209
CoD: A Diffusion Foundation Model for Image Compression
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 210
Diffusion MRI Transformer with a Diffusion Space Rotary Positional Embedding (D-RoPE)
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 211
Language-Guided One-Step Diffusion Model for Nighttime Flare Removal
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 212
SpiralDiff: Spiral Diffusion with LoRA for RGB-to-RAW Conversion Across Cameras
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 213
PnP-CM: Consistency Models as Plug-and-Play Priors for Inverse Problems
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 214
Landscape-Awareness for Geometric View Diffusion Model
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 215
Otil: Accelerating Diffusion Model Inference via Communication-Efficient Multi-GPU Parallelism
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 216
REACH: Explicit Recovery Behavior for Diffusion Policies
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 217
OralGPT-Omni: A Versatile Dental Multimodal Large Language Model
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 218
CrossHOI-Bench: A Unified Benchmark for HOI Evaluation across Vision-Language Models and HOI-Specific Methods
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 219
The LLM Bottleneck: Why Open-Source Vision LLMs Struggle with Hierarchical Visual Recognition
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 220
Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 222
Beyond Single Images: A Comprehensive Benchmark for Album-Level Vision-Language Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 223
LIBERO-Plus: A Progressive Robustness Benchmark for Visual-Language-Action Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 224
Scenes as Tokens: Multi-Scale Normal Distributions Transform Tokenizer for General 3D Vision–Language Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 226
Hear you are: Teaching LLMs Spatial Reasoning with Vision and Spatial Sound
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 227
EgoMind: Activating Spatial Cognition through Linguistic Reasoning in MLLMs
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 228
SAQN: Semantic-based Adaptive Query Network for 3D Referring Expression Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 229
EagleVision: A Dual-Stage Framework with BEV-grounding-based Chain-of-Thought for Spatial Intelligence
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 230
Abstract 3D Perception for Spatial Intelligence in Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 231
PV-Ground: Text-Guided Point-Voxel Interaction for 3D Visual Grounding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 232
Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 233
SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 235
PARSE: Part-Aware Relational Spatial Modeling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 237
MCHDoc: A Comprehensive Benchmark for Reading Multi-Carrier Chinese Historical Documents
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 238
Cross-modal Fuzzy Alignment Network for Text-Aerial Person Retrieval and A Large-scale Benchmark
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 239
CodeMMR: Bridging Natural Language, Code, and Image for Unified Retrieval
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 240
DiT-Distill: Open-Set Fine-Grained Retrieval via Generative Curriculum Knowledge
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 241
ReCALL: Recalibrating Capability Degradation for MLLM-based Composed Image Retrieval
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 242
Love Me, Love My Label: Rethinking the Role of Labels in Prompt Retrieval for Visual In-Context Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 243
Rethinking BCE Loss for Multi-Label Image Recognition with Fine-Tuning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 244
CAST: Context-Aware Dynamic Latent Space Transformation for Interactive Text-to-Image Retrieval
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 245
PriVi: Towards a General-Purpose Video Model for Primate Behavior in the Wild
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 246
Seeing Conversations: Communication Context Identification in Egocentric Video
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 247
Interactive Episodic Memory with User Feedback
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 248
Seeing without Pixels: Perception from Camera Trajectories
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 249
PFGNet: A Fully Convolutional Frequency-Guided Peripheral Gating Network for Efficient Spatiotemporal Predictive Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 250
Minerva-Ego: Spatiotemporal Hints for Egocentric Video Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 251
StreamRAG: Enhancing Real-Time Video Understanding with Retrieval Augmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 252
ViKey: Enhancing Temporal Understanding in Videos via Visual Prompting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 253
SkillSight: Efficient First-Person Skill Assessment with Gaze
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 254
BriMA: Bridged Modality Adaptation for Multi-Modal Continual Action Quality Assessment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 255
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 256
MedLIME: A Distribution-Aligned and Evidence-Supported Framework for Medical Saliency Explanations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 257
Inside-Out: Measuring Generalization in Vision Transformers Through Inner Workings
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 258
Language Models Can Explain Visual Features via Steering
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 259
Making the Classification Explanation Faithful to the Confidence Score
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 260
Intrinsic Concept Extraction Based on Compositional Interpretability
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 261
Attribution-Guided Model Rectification of Unreliable Neural Network Behaviors
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 262
Measuring the (Un)Faithfulness of Concept-Based Explanations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 263
Deformation-based In-Context Learning for Point Cloud Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 265
ESAM++: Efficient Online 3D Perception on the Edge
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 266
DualReg: Dual-Space Filtering and Reinforcement for Rigid Registration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 268
Rethinking 2D-3D Registration: A Novel Network for High-Value Zone Selection and Representation Consistency Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 269
Adaptive 3D Perception for Small Aerial Targets Under Sparse Sampling via Reinforcement Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 270
3D sans 3D Scans: Scalable Pre-training from Video-Generated Point Clouds
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 271
StreamVLO: Streaming Visual–LiDAR Odometry with Cumulative Drift Compensation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 272
Mamba Learns in Context: Structure-Aware Domain Generalization for Multi-Task Point Cloud Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 273
Routing on Demand: DSNet for Efficient Progressive Point Cloud Denoising
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 274
Hyper-PCN: Hypergraph-Based Point Cloud Completion via High-Order Correlation Modeling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 275
Towards Calibrating Prompt Tuning of Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 276
DEVA: Fine-tuning Multimodal Large Language Models for Visual Perception Tasks
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 277
LOREAL: Mitigating Low-Resolution Challenges in Vision-Language Models with Attribute-driven Prompt Self-Distillation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 278
OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 279
Language-guided Frequency Modulation for Large Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 280
TANGO: Text-Anchored Guided Optimization for Robust Fine-tuning Vision-Language Models under Label Noise
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 281
Cluster-Wise Spatio-Temporal Masking for Efficient Video-Language Pretraining
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 282
Reconstructing CLIP for Open-Vocabulary Dense Perception
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 283
DPL: Decoupled Prototype Learning for Enhancing Robustness of Vision–Language Transformers to Missing Modalities
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 284
BrepVGAE: Variational Graph Autoencoder with Unified Latent Representation for B-rep
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 285
NeuROK: Generative 4D Neural Object Kinematics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 286
BrickNet: Graph-Backed Generative Brick Assembly
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 287
Unified Vector Floorplan Generation via Markup Representation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 288
CME-CAD: Heterogeneous Collaborative Multi-Expert Reinforcement Learning for CAD Code Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 289
Robo-SGG: Exploiting Layout-Oriented Normalization and Restitution Can Improve Robust Scene Graph Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 290
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 291
EpiAgent: An Agent-Centric System for Ancient Inscription Restoration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 294
Image-based Outlier Synthesis With Training Data
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 295
SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 297
When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 298
IrisFP: Adversarial-Example-based Model Fingerprinting with Enhanced Uniqueness and Robustness
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 299
Mark4D: Temporally-Consistent Watermarking for 4D Gaussian Splatting
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 300
Machine Unlearning via Adaptive Gradient Reweighting and Multi-stage Objective Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 301
Taming Noise-Induced Prototype Degradation for Privacy-Preserving Personalized Federated Fine-Tuning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 302
FedMOP: Achieving Enhanced Privacy and Performance in Federated Learning via Momentum Orthogonal Projection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 303
HFedATM: Hierarchical Federated Domain Generalization via Optimal Transport and Regularized Mean Aggregation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 304
Single-Round Scalable Analytic Federated Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 305
Controllable Federated Prompt Learning at Test Time
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 307
Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 308
Spatial Matters: Position-Guided 3D Referring Expression Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 309
Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 310
Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 311
CaptionFormer: Unified Segmentation, Tracking, and Captioning for Spatio-Temporal Objects
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 312
TransPrune: Token Transition Pruning for Efficient Large Vision-Language Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 313
QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 314
Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 315
Collaborative Multi-Mode Pruning for Vision-Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 316
ZOO-Prune: Training-Free Token Pruning via Zeroth-Order Gradient Estimation in Vision-Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 317
HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 318
CORE: Compact Object-centric REpresentations as a New Paradigm for Token Merging in LVLMs
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 319
Imbalanced View Contribution Evaluation and Refinement for Deep Incomplete Multi-View Clustering
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 320
Multi-Hierarchical Contrastive Spectral Fusion for Multi-View Clustering
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 321
SECOS: Semantic Capture for Rigorous Classification in Open-World Semi-Supervised Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 322
Multi-Modal Representation Learning via Semi-Supervised Rate Reduction for Generalized Category Discovery
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 323
TimeBridge: Self-Supervised Video Representation Learning via Start-End Joint Embedding and In-Between Frame Prediction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 324
Mitigating Instance Entanglement in Instance-Dependent Partial Label Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 325
Residual Connections Harm Generative Representation Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 326
Neural Mixture Density Processes
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 327
Large-scale Robust Enhanced Ensemble Clustering via Outlier Decoupling
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 328
DriveLaW: Unifying Planning and Video Generation in a Latent Driving World
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 329
DLWM: Dual Latent World Models enable Holistic Gaussian-centric Pre-training in Autonomous Driving
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 330
Latent Chain-of-Thought World Modeling for End-to-End Driving
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 332
TrafficAlign: Aligning Large Language Models for Traffic Scenario Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 333
Failure Modes for Deep Learning–Based Online Mapping: How to Measure and Address Them
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 334
Linking Modality Isolation in Heterogeneous Collaborative Perception
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 335
LEAD: Minimizing Learner-Expert Asymmetry in End-to-End Driving
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 336
DriverGaze360: OmniDirectional Driver Attention with Object-Level Guidance
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 337
Diffusion Forcing Planner: History-Annealed Planning with Time-Dependent Guidance for Autonomous Driving
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 338
DIMOS: Disentangling Instance-level Moving Object Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 339
EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation without Scene Supervision
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 340
Live Interactive Training for Video Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 341
Robust Promptable Video Object Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 342
Scene-VLM: Multimodal Video Scene Segmentation via Vision-Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 343
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 344
BEV-CAR: Enhancing Monocular Bird’s Eye View Segmentation with Context-Aware Rasterization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 345
Exploring the Underwater World Segmentation without Extra Training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 347
Cross-Architecture Adaptation: Cloud-Edge Continual Test-Time Adaptation with Dynamic Sampling and Heterogeneous Distillation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 348
Towards Dynamic Modality Alignment in Multimodal Continual Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 350
Incremental Object Detection via Future-Aware Decoupled Cross-Head Distillation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 351
Smart Replay: Adaptive Scheduling of Memory Rehearsal for Computational Resource-Aware Incremental Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 352
ReBaPL: Repulsive Bayesian Prompt Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 353
Spectral Mixture-of-Experts for Continual Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 354
ActAvatar: Temporally-Aware Precise Action Control for Talking Avatars
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 355
ViBES: A Conversational Agent with Behaviorally-Intelligent 3D Virtual Body
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 356
DeX-Portrait: Disentangled and Expressive Portrait Animation via Explicit and Latent Motion Representations
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 357
SketchFaceGS: Real-Time Sketch-Driven Face Editing and Generation with Gaussian Splatting
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 358
MIBURI: Towards Expressive Interactive Gesture Synthesis
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 359
Personalized Image Descriptions from Attention Sequences
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 360
GA-VLN: Geometry-Aware BEV Representation for Efficient Vision-Language Navigation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 361
IMAIA: Interactive Maps AI Assistant for Travel Planning and Geo-Spatial Intelligence
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 362
OctoNav: Towards Generalist Embodied Navigation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 363
WalkGPT: Grounded Vision–Language Conversation with Depth-Aware Segmentation for Pedestrian Navigation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 364
SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 365
SMAP: Semantic Route Planning with Map-Grounded Multimodal Alignment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 367
Fresco: Frequency–Spatial Consistent Optimization for Fine-Grained Head Avatar Modeling
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 368
Motion-Aware Animatable Gaussian Avatars Deblurring
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 369
ELITE: Efficient Gaussian Head Avatar from a Monocular Video via Learned Initialization and Test-time Generative Adaptation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 370
Multi-view Consistent 3D Gaussian Head Avatars 'without' Multi-view Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 371
MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 372
Cross-Modal Attention Calibration for LVLM Hallucination Mitigation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 373
3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 374
Exposing and Evaluating Hallucinations for GUI Grounding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 375
Understanding and Mitigating Hallucinations in Multimodal Chain-of-Thought Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 376
Beyond the Global Scores: Fine-Grained Token Grounding as a Robust Detector of LVLM Hallucinations
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 377
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 378
Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 379
AniMimic: Imitating 3D Animation from Video Priors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 380
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 381
ScenDi: 3D-to-2D Scene Diffusion Cascades for Urban Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 382
MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 383
GeodesicNVS: Probability Density Geodesic Flow Matching for Novel View Synthesis
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 384
WorldStereo: Bridging Controllable Video Generation and Scene Reconstruction via 3D Geometric Memories
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 385
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 386
Taming Video Models for 3D and 4D Generation via Zero-Shot Camera Control
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 387
Improving Motion in Image-to-Video Models via Adaptive Low-Pass Guidance
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 388
SANER: Switchable Adapter with Non-parametric Enhanced Routing for Person De-Reidentification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 389
BIT: Matching-based Bi-directional Interaction Transformation Network for Visible-Infrared Person Re-Identification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 390
Vision-Language Attribute Disentanglement and Reinforcement for Lifelong Person Re-Identification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 391
Diversity over Uniformity: Rethinking Representation in Generated Image Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 392
Mining Instance-Centric Vision–Language Contexts for Human–Object Interaction Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 393
FSLoRA: Harmonizing Detection and Re-Identification via Freq-Spatial Low-Rank Adapter for One-Stage Person Search
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 394
EEGiT: Teaching Vision Transformers to Understand the EEG signal
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 395
FedBPrompt: Federated Domain Generalization Person Re-Identification via Body Distribution Aware Visual Prompts
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 396
Pose-guided Enriched Feature Learning for Federated-by-camera Person Re-identification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 397
UAV-CB: A Complex-Background RGB–T Dataset and Local Frequency Bridge Network for UAV Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 398
TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 400
LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 401
Agentic Video Summarization via Self-Reflecting Multimodal Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 402
Self-Critical Distillation Network for Video-based Commonsense Captioning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 403
Ego-Grounding for Personalized Question-Answering in Egocentric Videos
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 404
AdaSpark: Adaptive Sparsity for Efficient Long-Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 405
EarlyTom: Early Token Compression Completes Fast Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 406
VideoWorld 2: Learning Transferable Knowledge from Real-world Videos
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 407
VirtueBench: Evaluating Trustworthiness under Uncertainty in Long Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 408
DiverseDiT: Towards Diverse Representation Learning in Diffusion Transformers
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 409
RenderFlow: Single-Step Neural Rendering via Flow Matching
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 410
ResDiT: Evoking the Intrinsic Resolution Scalability in Diffusion Transformers
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 411
Masked Region Transformer for Layered Image Generation and Editing at Scale
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 412
DDT: Decoupled Diffusion Transformer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 413
Just-in-Time: Training-Free Spatial Acceleration for Diffusion Transformers
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 414
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 415
ShapeAR: Generating Editable Shape Layers via Autoregressive Diffusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 417
RecTok: Reconstruction Distillation along Rectified Flow
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 418
EgoXtreme: A Dataset for Robust Object Pose Estimation in Egocentric Views under Extreme Conditions
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 419
CoIn3D: Revisiting Configuration-Invariant Multi-Camera 3D Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 420
H^2A^2: Homogeneity-Aware and Heterogeneity-Aware Feature Perception for Unified Indoor 3D Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 421
Cov2Pose: Leveraging Spatial Covariance for Direct Manifold-aware 6-DoF Object Pose Estimation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 422
Towards Intrinsic-Aware Monocular 3D Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 423
SToRe3D: Sparse Token Relevance in ViTs for Efficient Multi-View 3D Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 424
SPAN: Spatial-Projection Alignment for Monocular 3D Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 425
DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 426
FailureAtlas: Mapping the Failure Landscape of T2I Models via Active Exploration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 427
HDR-VLM: HDR-Domain Adaptation of VLMs and Preference-Aligned Quality Assessment for HDR Video Color Grading
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 428
RobustVisRAG: Causality-Aware Vision-Based Retrieval-Augmented Generation under Visual Degradations
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 429
BiomedCCPL: Causal Conditional Prompt Learning for Biomedical Vision-Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 430
DynamicGTR: Leveraging Graph Topology Representation Preferences to Boost VLM Capabilities on Graph QAs
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 431
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 432
Revisiting Visual Corruptions in LVLMs: A Shape–Texture Perspective on Model Failures
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 433
From Intuition to Investigation: A Tool-Augmented Reasoning MLLM Framework for Generalizable Face Anti-Spoofing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 434
Trust-calibrated Collaborative Learning for Long-Tailed Visual Recognition
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 435
SunFaded: Illumination-Aware Gaussian Splatting for Dark Scenes with Camera-Mounted Active Lighting
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 436
TokenSplat: Token-aligned 3D Gaussian Splatting for Feed-forward Pose-free Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 437
GOR-IS: 3D Gaussian Object Removal In the Intrinsic Space
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 438
AeroGS: Scale-Aware Gaussian Splatting for Pose-Free Dynamic UAV Scene Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 439
Intrinsic Geometry-Appearance Consistency Optimization for Sparse-View Gaussian Splatting
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 440
AERGS-SLAM: Auto-Exposure-Robust Stereo 3D Gaussian Splatting SLAM
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 441
Learning Differentiable Hierarchies in 3D Gaussian Splatting
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 442
WeatherCity: Urban Scene Reconstruction with Controllable Multi-Weather Transformation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 443
Cross-View Splatter: Feed-Forward View Synthesis with Georeferenced Images
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 444
TagSplat: Topology-Aware Gaussian Splatting for Dynamic Mesh Modeling and Tracking
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 445
Hierarchical Visual Relocalization with Nearest View Synthesis from Feature Gaussian Splatting
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 446
Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 447
3D Gaussian Splatting from Unposed Spike Stream
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 448
SparseOIT: Improving Order-Independent Transparency 3DGS via Active Set Method
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 449
ClipGStream: Clip-Stream Gaussian Splatting for Any Length and Any Motion Multi-View Dynamic Scene Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 450
Space-Time Forecasting of Dynamic Scenes with Motion-aware Gaussian Grouping
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 451
MoRGS: Efficient Per-Gaussian Motion Reasoning for Streamable Dynamic 3D Scenes
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 452
BEA-GS: BEyond RAdiance Supervision in 3DGS for Precise Object Extraction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 453
EDGS: Eliminating Densification for Efficient Convergence of 3DGS
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 454
ReasonMap: Towards Fine-Grained Visual Reasoning from Transit Maps
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 455
Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 456
DialogueVPR: Towards Conversational Visual Place Recognition
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 457
Perceptual-Evidence Anchored Reinforced Learning for Multimodal Reasoning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 458
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 459
VinQA: Visual Elements Interleaved Long-form Answer Generation for Real-World Multimodal Document QA
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 460
DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 462
VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 463
Grounding Everything in Tokens for Multimodal Large Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 464
Evolving Contextual Safety in Multi-Modal Large Language Models via Inference-Time Self-Reflective Memory
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 465
ChartR: Evaluating Reasoning Accuracy and Robustness in Chart Question Answering
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 466
Think Visually, Reason Textually: Vision-Language Synergy in Abstract Reasoning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 467
VKG-QA: Visual Knowledge Graph-based Question Answer for Large Multimodal Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 468
Med-CMR: A Fine-Grained Benchmark Integrating Visual Evidence and Clinical Logic for Medical Complex Multimodal Reasoning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 470
VITAL: Vision-Encoder-centered Pre-training for LMMs in Visual Quality Assessment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 471
Generative Video Compression with One-Dimensional Latent Representation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 472
Markovian Scale Prediction: A New Era of Visual Autoregressive Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 473
Learned Image Compression via Sparse Attention and Adaptive Frequency
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 474
UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 475
VecAttention: Vector-wise Sparse Attention for Accelerating Long Context Inference
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 476
Ultra-Fast Neural Video Compression
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 477
Parallax to Align Them All: An OmniParallax Attention Mechanism for Distributed Multi-View Image Compression
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 478
Scaling Parallel Sequence Models to Vision Foundation Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 479
Revisiting Model Stitching In the Foundation Model Era
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 480
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 481
VLM-Loc: Localization in Point Cloud Maps via Vision-Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 482
HOLO: Homography-Guided Pose Estimator Network for Fine-Grained Visual Localization on SD Maps
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 483
TriLite: Efficient Weakly Supervised Object Localization with Universal Visual Features and Tri-Region Disentanglement
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 484
GeoSURGE: Geo-localization using Semantic Fusion with Hierarchy of Geographic Embeddings
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 486
OVOD-Agent: A Markov–Bandit Framework for Proactive Visual Reasoning and Self-Evolving Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 487
Pixel2Phys: Distilling Governing Laws from Visual Dynamics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 489
Seeing as Experts Do: A Knowledge-Augmented Agent for Open-Set Fine-Grained Visual Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 490
Dynamic Important Example Mining for Reinforcement Finetuning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 491
Specificity-aware reinforcement learning for fine-grained open-world classification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 492
SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 493
Uncertainty-Aware Modality Fusion for Unaligned RGB-T Salient Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 494
Fusion in Your Way: Aligning Image Fusion with Heterogeneous Demands via Direct Preference Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 495
More Than Meets the Eye: A Unified Image Fusion Framework via Semantic-Pixel Entropy Trade-off for Zero-Shot Generalization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 496
Beyond Sequential Tools: A Unified VLM Agent System for Photographic Post-Processing via Dynamic Multi-Expert Fusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 497
Multi-modal Frequency Decomposition Network for Semantic Scene Completion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 498
BiEvLight: Bi-level Learning of Task-Aware Event Refinement for Low-Light Image Enhancement
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 499
FusionRegister: Every Infrared and Visible Image Fusion Deserves Registration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 500
OmniFood8K: Single-Image Nutrition Estimation via Hierarchical Frequency-Aligned Fusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 501
Enhancing Unregistered Hyperspectral Image Super-Resolution via Unmixing-based Abundance Fusion Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 502
LRHDR: Learning Representation-enhanced HDR Video Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 503
Cross-Domain Few-Shot Segmentation via Multi-view Progressive Adaptation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 504
Interpretable Cross-Domain Few-Shot Learning with Rectified Target-Domain Local Alignment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 505
PP-Brep: Few-Shot B-rep Classification with Hybrid Graph Representation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 506
AgentDet: A Shared-Blackboard Multi-Agent Framework for Zero-/Few-Shot Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 507
SFR-Net: Steering-Fusion-Refining Network in Multi-label Zero-Shot Sewer Defect Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 508
Noise-Aware Few-Shot Learning through Bi-directional Multi-View Prompt Alignment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 510
Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 511
Progressive Mask Distillation for Self-supervised Video Representation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 512
HierAmp: Coarse-to-Fine Autoregressive Amplification for Generative Dataset Distillation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 513
SpiderCam: Low-Power Snapshot Depth from Differential Defocus
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 514
Computational Speckle Pattern Interferometry
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 515
DetectSCI: Toward Object-Guided ROI Reconstruction for High-Resolution Video Snapshot Compressive Imaging
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 516
Solving a Nonlinear Blind Inverse Problem for Tagged MRI with Physics and Deep Generative Priors
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 517
Nonlinear Color Transfer via Learnable Bezier Flows
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 518
VT-Intrinsic: Physics-Based Decomposition of Reflectance and Shading using a Single Visible-Thermal Image Pair
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 520
Computer Vision with a Superpixelation Camera
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 521
Color-Encoded Illumination for High-Speed Volumetric Scene Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 522
Multi-Scale Gradient-Guided Unrolling Architecture with Adaptive Mamba for Compressive Sensing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 523
Deciphering Genotype-Phenotype Mechanisms from High-Content Profiling via Knowledge-Guided Multi-modal Graph Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 524
Bulk RNA-seq Guided Multi-modal Detection of Anomalous Regions in Human Cancer via Spatial Transcriptomics
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 525
Intervention-Aware Multiscale Representation Learning from Imaging Phenomics and Perturbation Transcriptomics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 526
ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 528
PromptLoop: Plug-and-Play Prompt Refinement via Latent Feedback for Diffusion Model Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 529
EvoID: Reinforced Evolution for Identity-Preserving Video Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 530
Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 531
PhyCo: Learning Controllable Physical Priors for Generative Motion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 532
Unified Multimodal Models as Auto-Encoders
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 533
Expand and Prune: Maximizing Trajectory Diversity for Effective GRPO in Generative Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 535
Drainage: A Unifying Framework for Addressing Class Uncertainty
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 536
Neural Differentiation in Deep Networks: A Theoretical Framework for Expressivity and Representational Diversity
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 537
DuetMerging: Synergizing Dynamic and Static Strategies for Mitigating Task Interference in Model Merging
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 538
SASNet: Spatially-Adaptive Sinusoidal Networks for INRs
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 539
Generative Modeling of Weights: Generalization or Memorization?
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 540
Vision-Oriented Lightweight Neural Architecture Search with Budget-Adaptive Evaluation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 542
Stepwise Credit Assignment for GRPO on Flow-Matching Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 543
FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 546
Image-to-Point Cloud Feature Back-Projection for Multimodal Training of 3D Semantic Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 547
NG-GS: NeRF-guided 3D Gaussian Splatting Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 548
Teaching DINOv3 About Partial 3D Geometry: A Self-Supervised Geometry-Aware Approach
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 549
SemLayer: Semantic-aware Generative Segmentation and Layer Construction for Abstract Icons
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 550
MatchED: Crisp Edge Detection Using End-to-End, Matching-based Supervision
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 551
SegGBC: Justifiable Coarse-to-Fine Granular-Ball Computing for Enhancing Clustering Image Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 552
Seeing Beyond: Extrapolative Domain Adaptive Panoramic Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 553
MatchMask: Mask-Centric Generative Data Augmentation for Label-Scarce Semantic Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 554
Boundary-Responsive Differentiable Gating for Superpixel-Based Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 555
Task-Oriented Data Synthesis and Control-Rectify Sampling for Remote Sensing Semantic Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 556
FUSAR-GPT: A Spatiotemporal Feature-Embedded and Two-Stage Decoupled Visual Language Model for SAR Imagery
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 557
UniChange: Unifying Change Detection with Multimodal Large Language Model
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 558
Spatiotemporal Pyramid Flow Matching for Climate Emulation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 559
See What We Cannot See: A Geo-guided Reasoning Benchmark for Object Counting under Adverse Earth Observation Conditions
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 560
MM-OVSeg: Multimodal Optical–SAR Fusion for Open-Vocabulary Segmentation in Remote Sensing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 561
RECS4R: Bridging Semantics and Geometry for Referring Remote Sensing Interpretation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 562
Fourier Angle Alignment for Oriented Object Detection in Remote Sensing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 563
Learning to Infer Parameterized Representations of Plants from 3D Scans
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 564
Good Can Sometimes be Bad: A Unified Attack against 3D Point Cloud Classifier by a Flexible Isotropic Resampling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 565
V-Attack: Targeting Disentangled Value Features for Controllable Adversarial Attacks on LVLMs
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 566
FeatureFool: Zero-Query Fooling of Video Models via Feature Map
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 567
RankOOD - Class Ranking-based Out-of-Distribution Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 568
AdvFM: Lookahead Flow-Matching Velocity-Field Attacks for Imperceptible and Transferable Adversarial Examples
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 569
The Power of Decaying Steps: Enhancing Attack Stability and Transferability for Sign-based Optimizers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 570
Your Classifier Can Do More: Towards Balancing the Gaps in Classification, Robustness, and Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 571
Learning Mutual View Information Graph for Adaptive Adversarial Collaborative Perception
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 572
Hierarchical Attacks for Multi‑Modal Multi‑Agent Reasoning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 573
Omni-Attack: Adversarial Attacks on Open-Ended VQA in Black-Box Multimodal LLMs
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 574
CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 575
Δynamics: Language-Based Representation for Inferring Rigid-Body Dynamics From Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 576
PvP: Data-Efficient Humanoid Robot Learning with Proprioceptive-Privileged Contrastive Representations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 577
Diagnose, Correct, and Learn from Manipulation Failures via Visual Symbols
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 579
GeCo-SRT: Geometry-aware Continual Adaptation for Cross-Task Sim-to-Real Transfer
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 580
ActiveGrasp: Information-Guided Active Grasping with Calibrated Energy-based Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 581
BiPreManip: Learning Affordance-Based Bimanual Pre-Manipulation through Anticipatory Collaboration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 582
Learning Surgical Robotic Manipulation with 3D Spatial Priors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 583
SimRecon: SimReady Compositional Scene Reconstruction from Real Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 584
STRNet: Visual Navigation with Spatio-Temporal Representation through Dynamic Graph Aggregation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 585
RaUF: Learning the Spatial Uncertainty Field of Radar
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 586
SIR: Structured Image Representations for Explainable Robot Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 587
Instance-level Visual Active Tracking with Occlusion-Aware Planning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 588
Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 589
AnthroTAP: Learning Point Tracking with Real-World Motion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 590
Tracking by Predicting 3-D Gaussians Over Time
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 591
Toward Low-Cost yet Effective Temporal Learning for UAV Tracking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 592
Rethinking Two-Stage Referring-by-Tracking in Referring Multi-Object Tracking: Make it Strong Again
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 593
Occlusion-Aware SORT: Observing Occlusion for Robust Multi-Object Tracking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 594
CoWTracker: Tracking by Warping instead of Correlation
[
Slides]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 595
Learning Long-term Motion Embeddings for Efficient Kinematics Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 596
SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 597
Beyond Explicit Language: Plug-and-Play Visual-to-Linguistic Modeling Toward General Object Tracking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 598
FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 599
InvCoSS: Inversion-driven Continual Self-supervised Learning in Medical Multi-modal Image Pre-training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 600
PETAR: Localized Findings Generation with Mask-Aware Vision-Language Modeling for PET Automated Reporting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 601
From Panel to Pixel: Zoom-In Vision–Language Pretraining from Biomedical Scientific Literature
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 602
LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 603
D2T2 - Multimodal Automated Planning for Brachytherapy
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 604
TopoCL: Topological Contrastive Learning for Medical Imaging
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 605
Diffusion with a Linguistic Compass: Steering the Generation of Clinically Plausible Future sMRI Representations for Early MCI Conversion Prediction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 606
Personalized Longitudinal Medical Report Generation via Temporally-Aware Federated Adaptation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 607
Decoding 3D Perception via BrainSSD: Synergistic Fusion of EEG Representations from Static and Dynamic Visual Streams
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 608
Duala: Dual-Level Alignment of Subjects and Stimuli for Cross-Subject fMRI Decoding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 609
OmniBrainBench: A Comprehensive Multimodal Benchmark for Brain Imaging Analysis Across Multi-stage Clinical Tasks
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 610
Beyond Pixel Simulation: Pathology Image Generation via Diagnostic Semantic Tokens and Prototype Control
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 611
MedFG-VQA: Low-Frequency Memory and Graph Attention for Lightweight Medical VQA
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 612
FISHuman: Fine-grained Single-image 3D Human Reconstruction via Multi-view 4D Remeshing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 613
DuoMo: Dual Motion Diffusion for World-Space Human Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 614
RAM: Recover Any 3D Human Motion in-the-Wild
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 615
From 2D Alignment to 3D Plausibility: Unifying Heterogeneous 2D Priors and Penetration-Free Diffusion for Occlusion-Robust Two-Hand Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 616
MV-Fashion: Towards Enabling Virtual Try-On and Size Estimation with Multi-View Paired Data
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 617
Forecasting 3D Scanpaths in Egocentric Video
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 618
M4Human: A Large-Scale Multimodal mmWave Radar Benchmark for Human Mesh Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 619
ReGenHOI: Unifying Reconstruction and Generation for 3D Human–Object Interaction Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 620
Through the Frequency Lens: Cross-Domain Generalisable Gaze Estimation with Adaptive Modulation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 621
Mocap-2-to-3: Multi-view Lifting for Monocular Motion Recovery with 2D Pretraining
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 622
SHands: A Multi-View Dataset and Benchmark for Surgical Hand-Gesture and Error Recognition Toward Medical Training
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 623
Beyond Static Frames: Temporal Aggregate-and-Restore Vision Transformer for Human Pose Estimation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 624
IMU-HOI: A Symbiotic Framework for Coherent Human-Object Interaction and Motion Capture via Contact-Conscious Inertial Fusion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 625
Learning Forgery-Aware Lip Representations Without Forgery Priors
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 626
Beyond [CLS] Token: Query-Driven Token-Level Forgery Purification for Generalizable Deepfake Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 627
GEM-TFL: Bridging Weak and Full Supervision for Forgery Localization through EM-Guided Decomposition and Temporal Refinement
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 628
TokenTrace: Multi-Concept Attribution through Watermarked Token Recovery
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 629
Unleashing Vision-Language Semantics for Deepfake Video Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 630
A Difference-in-Difference Approach to Detecting AI-Generated Images
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 631
RDFace: A Benchmark Dataset for Rare Disease Facial Image Analysis under Extreme Data Scarcity and Phenotype-Aware Synthetic Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 632
ActivityForensics: A Comprehensive Benchmark for Localizing Manipulated Activity in Videos
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 633
Zero-shot Detection of AI-Generated Image via RAW-RGB Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 634
Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 635
Investigating Self-Supervised Representations for Audio-Visual Deepfake Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 636
TIACam: Text-Anchored Invariant Feature Learning with Auto-Augmentation for Camera-Robust Zero-Watermarking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 637
FastRef: Fast Prototype Refinement for Few-shot Industrial Anomaly Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 638
RC-NF: Robot-Conditioned Normalizing Flow for Real-Time Anomaly Detection in Robotic Manipulation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 639
Reasoning-Driven Anomaly Detection and Localization with Image-Level Supervision
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 641
Wavelet-Driven 3D Anomaly Detection under Pose-Agnostic and Sparse-View
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 642
Hunting Normality from Query Sample via Residual Learning for Generalist Anomaly Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 643
GPFlow: Gaussian Prototype Probability Flow for Unsupervised Multi-Modal Anomaly Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 644
HP-Edit: A Human-Preference Post-Training Framework for Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 645
It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 646
RebRL: Reinforcing Discrete Visual Diffusion Models with Rebalanced Timestep Credits
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 647
Ego-InBetween: Generating Object State Transitions in Ego-Centric Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 648
Towards Fine-Grained Attribution: Instance-Aware Preference Optimization for Aligning Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 649
SketchRevive: Fine-Grained Pixel-to-Vector Sketch Completion with Diffusion-Prior-Guided Multimodal LLMs
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 650
UniPercept: A Unified Diffusion Model for Generalizable Visual Perception
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 651
Visual Diffusion Models are Geometric Solvers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 652
You Only Erase Once: Erasing Anything without Bringing Unexpected Content
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 653
Smoothing the Score Function to Enhance Generalization in Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 654
NS-Diff: Fluid Navier–Stokes Guided Video Diffusion via Reinforcement Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 655
PropFly: Learning to Propagate via On-the-Fly Supervision from Pre-trained Video Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 656
Generative Neural Video Compression via Video Diffusion Prior
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 657
AdaCluster: Adaptive Query-Key Clustering for Sparse Attention in Video Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 658
Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 659
Image Diffusion Preview with Consistency Solver
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 660
The Drift Kernel: Why Diffusion Models Change Even When Told Not To
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 661
Interpretable Prompts made Edit-Friendly: Token-to-Token Similarity Reduction in dLLMs for Edit-Friendly Hard Prompt Inversion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 662
LESA: Learnable Stage-Aware Predictors for Diffusion Model Acceleration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 663
Vision Foundation Models Can Be Good Tokenizers for Latent Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 664
Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 665
Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 666
EasyOmnimatte: Taming Pretrained Inpainting Diffusion Models for End-to-End Video Layered Decomposition
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 667
Hierarchical Codec Diffusion for Video-to-Speech Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 668
Semantic Alignment for Pose-Invariant Identity Preserving Diffusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 669
Causality in Video Diffusers is Separable from Denoising
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 670
2ndMatch: Finetuning Pruned Diffusion Models via Second-Order Jacobian Matching
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 671
Hear What You See: Video-to-Audio Generation with Diffusion Transformer and Semantic-Temporal Alignment-Ranked Direct Preference Optimization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 672
MacTok: Robust Continuous Tokenization for Image Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 673
Group Editing: Edit Multiple Images in One Go
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 674
Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 675
Beyond the Golden Data: Resolving the Motion-Vision Quality Dilemma via Timestep Selective Training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 676
Toward Diffusible High-Dimensional Latent Spaces: A Frequency Perspective
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 677
Elucidating the SNR-t Bias of Diffusion Probabilistic Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 678
What Is It Like to Be a Noise? An Entropy-based Gaussian Noise Regularization for Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 679
FlashVSR: Towards Real-time Diffusion-Based Streaming Video Super Resolution
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 680
DiffusionHarmonizer: Bridging Neural Reconstruction and Photorealistic Simulation with Online Diffusion Enhancer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 681
GDRO: Group-level Reward Post-training Suitable for Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 682
RFDM: Residual Flow Diffusion Models for Video Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 683
FreqEdit: Preserving High-Frequency Features for Robust Multi-Turn Image Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 684
Graph-Guided Online Concept Erasure for Text-to-Image Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 685
HierEdit: Region-Aware Hierarchical Diffusion for Efficient High-Resolution Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 686
CTCal: Rethinking Text-to-Image Diffusion Models via Cross-Timestep Self-Calibration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 687
Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 688
DeltaQuant: 4-bit Video Diffusion Models with Spatiotemporal Delta Smoothing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 689
D2Cache: Second-Order Delta Caching for Higher Video Diffusion Acceleration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 690
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 691
Test-Time Alignment of Text-to-Image Diffusion Models via Null-Text Embedding Optimisation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 692
Accelerating Diffusion Model Training under Minimal Budgets: A Condensation-Based Perspective
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 693
Denoising as Path Planning: Training-Free Acceleration of Diffusion Models with DPCache
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 694
Taming Sampling Perturbations with Variance Expansion Loss for Latent Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 695
Guiding Diffusion Models with Semantically Degraded Conditions
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 696
Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 698
Coupled Diffusion Sampling for Training-Free Multi-View Image Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 699
Improving Diffusion Generalization with Weak-to-Strong Segmented Guidance
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 700
Adaptive Auxiliary Prompt Blending for Target-Faithful Diffusion Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 701
SegQuant: A Semantics-Aware and Generalizable Quantization Framework for Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 702
BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 703
Accelerating Autoregressive Video Diffusion via History-Guided Cache and Residual Correction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 704
MusicInfuser: Making Video Diffusion Listen and Dance
[
Poster]
Poster Session
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A None
Poster Session 6