Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 1
Chorus: Multi-Teacher Pretraining for Holistic 3D Gaussian Scene Encoding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 2
Featurising Pixels from Dynamic 3D Scenes with Linear In-Context Learners
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 3
From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 4
Linear Fundamental Matrix Estimation from 7 or 5 Points
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 5
OccuFly: A 3D Vision Benchmark for Semantic Scene Completion from the Aerial Perspective
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 6
VGGT-Ω
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 7
CodeV: Code with Images for Faithful Visual Reasoning via Tool-Aware Policy Optimization
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 8
NitroGen: An Open Foundation Model for Generalist Gaming Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 9
PAI-Bench: A Comprehensive Benchmark For Physical AI
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 10
RefAV: Towards Planning-Centric Scenario Mining
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 11
SoccerMaster: A Vision Foundation Model for Soccer Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 12
VS-Bench: Evaluating VLMs for Strategic Abilities in Multi-Agent Environments
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 13
Breaking the Scalability Limit of Multi-Projector Calibration with Embedded Cameras
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 14
GaussianFluent: Gaussian Simulation for Dynamic Scenes with Mixed Materials
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 15
InfiniBench: Infinite Benchmarking for Visual Spatial Reasoning with Customizable Scene Complexity
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 16
MAGICIAN: Efficient Long-Term Planning with Imagined Gaussians for Active Mapping
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 17
Memory-Augmented Scene Understanding and Exploration for Open-World Aerial Object-Goal Navigation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 18
Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 19
INSID3: Training-Free In-Context Segmentation with DINOv3
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 20
MARCO: Navigating the Unseen Space of Semantic Correspondence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 21
PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 22
R^2-Seg: Training-Free OOD Medical Tumor Segmentation via Anatomical Reasoning and Statistical Rejection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 23
The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 24
VGGT-Segmentor: Geometry-Enhanced Cross-View Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 25
DAGE: Dual-Stream Architecture for Efficient and Fine-Grained Geometry Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 26
Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 27
Lite Any Stereo: Efficient Zero-Shot Stereo Matching
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 28
MuM: Multi-View Masked Image Modeling for 3D Vision
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 29
ZipMap: Linear-Time Stateful 3D Reconstruction via Test-Time Training
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 30
Scal3R: Scalable Test-Time Training for Large-Scale 3D Reconstruction
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 31
LaRP: Efficient Multi-View Inpainting with Latent Reprojection Priors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 32
TopoMA: Topology-Guided Multi-Agent Dense RGB 3D Reconstruction via Distributed Inference
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 33
Sparse-View Localization via Online Neural 3D Regression
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 34
Dynamic Visual SLAM using a General 3D Prior
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 35
Learning Scene Coordinate Reconstruction from Unposed Images via Pose Graph Optimization
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 36
FlashVGGT: Efficient and Scalable Visual Geometry Transformers with Compressed Descriptor Attention
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 37
No Calibration, No Depth, No Problem: Cross-Sensor View Synthesis with 3D Consistency
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 38
UFO: Unifying Feed-Forward and Optimization-based Methods for Large Driving Scene Modeling
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 39
Reliev3R: Relieving Feed-forward 3D Reconstruction from Multi-View Geometric Annotations
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 40
TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 41
Global Structure-from-Motion Meets Feedforward Reconstruction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 42
POCA: Pareto-Optimal Curriculum Alignment for Visual Text Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 43
DuoGen: Towards Autonomous Interleaved Multimodal Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 44
Vibe Spaces for Creatively Connecting and Expressing Visual Concepts
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 45
StoryTailor: A Zero-Shot Pipeline for Action-Rich Multi-Subject Visual Narratives
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 46
CREward: A Type-Specific Creativity Reward Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 47
LumiX: Structured and Coherent Text-to-Intrinsic Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 48
Synthetic Curriculum Reinforces Compositional Text-to-Image Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 49
OmniGen2: Towards Instruction-Aligned Multimodal Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 50
Selectively Extracting and Injecting Visual Attributes into Text-to-Image Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 51
LoFA: Learning to Predict Personalized Prior for Fast Adaptation of Visual Generative Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 52
UniVerse: Empower Unified Generation with Reasoning and Knowledge
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 53
UniVerse: A Unified Modulation Framework for Segmentation-Free, Disentangled Multi-Concept Personalization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 54
Residual Decoder Adapter: ID-Preserving Tokenizer Adaption for Autoregressive Text Rendering
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 55
TGT: Text-Grounded Trajectories for Locally Controlled Video Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 56
RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 57
FlowFixer: Towards Detail-Preserving Subject-Driven Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 58
TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 59
UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 60
FEAT: Fashion Editing and Try-On from Any Design
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 61
Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 62
PointAlign: Feature-Level Alignment Regularization for 3D Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 63
PowerCLIP: Powerset Alignment for Contrastive Pre-Training
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 64
MoBind: Motion Binding for Fine-Grained IMU–Video Pose Alignment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 66
Tackling Model Bias via Game-theoretic Multi-agent Collaboration Framework for Hateful Meme Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 67
CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 68
MM-ReCoder: Advancing Chart-to-Code Generation with Reinforcement Learning and Self-Correction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 69
Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 70
Hierarchical Process Reward Models are Symbolic Vision Learners
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 71
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 73
AcTTA: Rethinking Test-Time Adaptation via Dynamic Activation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 74
Reframing Long-Tailed Learning via Loss Landscape Geometry
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 75
Cleaning the Pool: Progressive Filtering of Unlabeled Pools in Deep Active Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 76
DC-Merge: Improving Model Merging with Directional Consistency
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 77
TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 78
Event-Illumination Collaborative Low-light Image Enhancement with a High-resolution Real-world Dataset
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 79
NEC-Diff: Noise-Robust Event-RAW Complementary Diffusion for Seeing Motion in Extreme Darkness
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 80
Towards Persistence: Learning Topological Constraints for Event-based Small Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 81
Geometric-Photometric Event-based 3D Gaussian Ray Tracing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 82
EventDrive: Event Cameras for Vision-Language Driving Intelligence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 83
EventGait: Towards Robust Gait Recognition with Event Streams
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 84
MergeVLA: Cross-Skill Model Merging Toward a Generalist Vision-Language-Action Agent
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 85
Resolving the Stability-Plasticity Dilemma in Reinforcement Learning via Complementary Continual Critics
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 86
SAGE: Scalable Agentic 3D Scene Generation for Embodied AI
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 87
Semantic Audio-Visual Navigation in Continuous Environments
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 88
Unifying Perception and Action: A Hybrid-Modality Pipeline with Implicit Visual Chain-of-Thought for Robotic Action Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 89
FLARE: A Failure-Aware Framework for Autonomous Correction and Recovery in Visual-Language Robotic Manipulation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 90
Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 91
General Process Reward Modeling for Robotic Reinforcement Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 93
Action-Sketcher: From Reasoning to Action via Visual Sketches for Robotic Manipulation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 94
Thinking in 360°: Humanoid Visual Search in the Wild
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 95
Learning from Semantic Dictionaries: Discriminative Codebook Contrastive Learning for Unified Visual Representation and Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 96
MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 97
Cycle-Consistent Tuning for Layered Image Decomposition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 98
RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 99
Beyond Objects: Contextual Synthetic Data Generation for Fine-Grained Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 100
NEAF: Natural Image Editing with Attention Fusion for Generalizable Test-time Optimization in Text-Guided Image Editing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 101
OntoAug: Rethinking Generative Data Augmentation via Ontology Guidance
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 102
Spherical Voronoi: Directional Appearance as a Differentiable Partition of the Sphere
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 103
4DSurf: High-Fidelity Dynamic Scene Surface Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 104
Learning 3D Representations for Spatial Intelligence from Unposed Multi-View Images
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 105
Depth Peeling for High-Fidelity Gaussian-Enhanced Surfel Rendering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 106
Intrinsic Image Fusion for Multi-View 3D Material Reconstruction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 107
PackUV: Packed Gaussian UV Maps for 4D Volumetric Video
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 108
Opti-NeuS: Neural Reconstruction for Dual-Layered Transparent and Opaque Objects
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 109
PhysGaia: A Physics-aware Benchmark with Multi-Body Interactions for Dynamic Novel View Synthesis
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 110
MatSpray: Fusing 2D Material World Knowledge on 3D Geometry
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 111
OMoBlur: An Object Motion Blur Dataset and Benchmark for Real-World Local Motion Deblurring
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 112
Hybrid Agents for Image Restoration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 113
Zero-Shot Image Denoising via Hybrid Prior-Guided Pseudo Sample Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 114
Self-supervised Dynamic Heterogeneous Degradation Modeling for Unified Zero-Shot Image Restoration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 115
Next-Scale Prediction: A Self-Supervised Approach for Real-World Image Denoising
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 116
PhaSR: Generalized Image Shadow Removal with Physically Aligned Priors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 117
UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 118
FastGaMer: Efficient GainMap Learning for Practical Inverse Tone Mapping
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 119
MDS-VQA: Model-Informed Data Selection for Video Quality Assessment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 120
Seeing through Light and Darkness: Sensor-Physics Grounded Deblurring HDR NeRF from Single-Exposure Images and Events
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 121
Disentanglement-wise Image Dehazing through Cross-Domain Manifold Consensus
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 122
Unsupervised Multi-Scale Segmentation of 3D Subcellular World with Stable Diffusion Foundation Model
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 123
EchoPOSE: 6D Pose Estimation of Sparse Echocardiograms for Left-Ventricular 3D Shape Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 124
Spatial-SAM: Spatially Consistent 3D Electron Microscopy Segmentation with SDF Memory and Semi-Supervised Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 125
LLaDA-MedV: Exploring Large Language Diffusion Models for Biomedical Image Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 126
TAlignDiff: Automatic Tooth Alignment assisted by Diffusion-based Transformation Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 127
Harmonized Feature Conditioning and Frequency-Prompt Personalization for Multi-Rater Medical Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 128
Masked-Diffusion Autoencoders for 3D Medical Vision Representation Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 129
PGR-Net: Prior-Guided ROI Reasoning Network for Brain Tumor MRI Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 130
Test-Time Attention Purification for Backdoored Large Vision Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 131
AGFT: Alignment-Guided Fine-Tuning for Zero-Shot Adversarial Robustness of Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 132
Towards Robust Multimodal Large Language Models Against Jailbreak Attacks
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 133
R^2TUA: Reconstruction-residual Based Targeted and Untargeted Attack Against Text-Image Person Re-Identification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 134
When Robots Obey the Patch: Universal Transferable Patch Attacks on Vision-Language-Action Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 135
FlowHijack: A Dynamics-Aware Backdoor Attack on Flow-Matching Vision-Language-Action Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 136
Principled Steering via Null-space Projection for Jailbreak Defense in Vision-Language Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 137
Enhancing Part-Level Point Grounding for Any Open-Source MLLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 138
MeteorPred: A Meteorological Multimodal Large Model and Dataset for Severe Weather Event Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 139
YieldSAT: A Multimodal Benchmark Dataset for High-Resolution Crop Yield Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 140
How Far Can We Go With Synthetic Data for Audio-Visual Sound Source Localization?
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 141
Modeling Cross-vision Synergy for Unified Large Vision Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 142
Beyond Missing Modalities: Hypergraph Conditioned Diffusion for Uncertainty-Aware Multimodal Emotion Recognition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 143
Rosetta Stone For Unified MLLMs: A Unified Tokenizer to Decipher Understanding and Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 144
MOON2.0: Dynamic Modality-balanced Multimodal Representation Learning for E-commerce Product Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 145
Nano-EmoX: Unifying Multimodal Emotional Intelligence from Perception to Empathy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 146
AMusE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 147
Prototype-as-Prompt: Multimodal Sentiment Prototypes Endowing Large Language Models the Capability to Perform Multimodal Sentiment Analysis
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 148
CF-IPT: Cross-Modal Fusion Interactive Prompt Tuning of Vision-Language Pre-Trained Model for Multisource Remote Sensing Data Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 149
EMAD: Evidence-Centric Grounded Multimodal Diagnosis for Alzheimer’s Disease
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 150
Multimodal Learning on Low-Quality Data with Conformal Predictive Self-Calibration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 151
Cross-View Distillation and Adaptive Masking for Incomplete Multi-View Multi-Label Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 152
Bootstrap Your Own AV-Proxies: Adaptive Contrastive and Prototype Learning for Audio-Visual Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 154
M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 155
Text-Driven 3D Hand Motion Generation from Sign Language Data
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 156
Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 157
GenHOI: Towards Object-Consistent Hand–Object Interaction with Temporally Balanced and Spatially Selective Object Injection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 158
Clay-to-Stone: Phase-wise 3D Gaussian Splatting for Monocular Articulated Hand-Object Manipulation Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 159
Training-free Motion Factorization for Compositional Video Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 160
Audio-sync Video Instance Editing with Granularity-Aware Mask Refiner
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 161
CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 162
FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 163
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 164
PoseAnything: General Pose-guided Video Generation with Part-aware Temporal Coherence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 165
FastHybrid: Accelerating Hybrid Autoregressive Image Generation with Lookahead and Guided Decoding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 166
DPAR: Dynamic Patchification for Efficient Autoregressive Visual Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 167
AlcheMinT: Fine-grained Temporal Control for Multi-Reference Consistent Video Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 168
LeapAlign: Post-training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 169
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 170
Flow Matching for Multimodal Distributions
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 171
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 172
ReasonEdit: Towards Reasoning-Enhanced Image Editing Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 173
Cross-Subject EEG-to-Video Reconstruction and Beyond
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 174
Rethinking Position Embedding as a Context Controller for Multi-Reference and Multi-Shot Video Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 175
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 176
BiFM: Bidirectional Flow Matching for Few-Step Image Editing and Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 177
DTG-Restore: Training-Free Diffusion Refinement for Generative Video Super-Resolution
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 178
VABench: A Comprehensive Benchmark for Audio-Video Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 179
Relightful Video Portrait Harmonization
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 180
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 181
DVAR: Dynamic Visual Autoregressive Modeling for Image Super-Resolution
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 182
Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 184
UCAN: Unified Convolutional Attention Network for Expansive Receptive Fields in Lightweight Super-Resolution
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 186
RAW-Domain Degradation Models for Realistic Smartphone Super-Resolution
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 187
One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 188
FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 189
HDW-SR: High-Frequency Guided Diffusion Model based on Wavelet Decomposition for Image Super-Resolution
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 190
Unifying Precise Keyframes and Semantic Control via Multi-level Diffusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 191
CIGPose: Causal Intervention Graph Neural Network for Whole-Body Pose Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 192
Pressure2Motion: Hierarchical Human Motion Reconstruction from Ground Pressure with Text Guidance
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 193
From 3D Pose to Prose: Biomechanics-Grounded Vision–Language Coaching
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 194
InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 195
MoCoDiff: A Controllable Autoregressive Diffusion Model for Expressive Motion Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 196
W2W: Language-Model-Based Trajectory Prediction with Reinforcement Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 197
ParTY: Part-Guidance for Expressive Text-to-Motion Synthesis
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 198
Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 199
Unified Number-Free Text-to-Motion Generation Via Flow Matching
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 200
Generative Diffusion Priors for 3D Mapping of the Dark Universe
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 201
FlowPalm: Optical Flow Driven Non-Rigid Deformation for Geometrically Diverse Palmprint Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 202
DiffuView: Multi-View Diffusion Pretraining for 3D Aware Robotic Manipulation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 203
Circuit Mechanisms for Spatial Relation Generation in Diffusion Transformers
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 204
Dual Ascent Diffusion for Inverse Problems
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 205
Forecast the Principal, Stabilize the Residual: Subspace-Aware Feature Caching for Diffusion Transformers
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 206
Spatial-Spectral Residuals Informed Diffusion Neural Operator for Pan-sharpening
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 207
PhyOceanCast: Global Ocean Forecasting with Physics-Informed Diffusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 208
Pixel Motion Diffusion is What We Need for Robot Control
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 210
M3Grounder: Mask-Based Multi-Span and Multi-Granular Grounding for Document QA
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 211
BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 212
Towards Real-World Document Parsing via Realistic Scene Synthesis and Document-Aware Training
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 213
RoadSceneBench: A Lightweight Benchmark for Mid-Level Road Scene Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 214
UNICBench: UNIfied Counting Benchmark for MLLM
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 215
CaptionQA: Is Your Caption as Useful as the Image Itself?
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 216
EgoProx: Evaluating MLLMs on Egocentric 3D Proximity Reasoning Across a Cognitive Hierarchy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 217
VULCAN: Tool-Augmented Multi Agents for Iterative 3D Object Arrangement
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 219
Efficient Encoder-Free Fourier-based 3D Large Multimodal Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 220
Socratic-Geo: Synthetic Data Generation and Cross-Modal Geometric Reasoning via Multi-Agent Interaction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 221
HAMMER: Harnessing MLLMs via Cross-Modal Integration for Intention-Driven 3D Affordance Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 222
Proxy3D: Efficient 3D Representations for Vision-Language Models via Semantic Clustering and Alignment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 223
ReLaGS: Relational Language Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 224
3D-IDE: 3D Implicit Depth Emergent
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 226
Parse, Search, and Confirmation: Training-Free Aerial Vision-and-Dialog Navigation with Chain-of-Thought Reasoning and Structured Spatial Memory
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 227
4DP-QA: Scalable QA for 4D Perception in Vision Language Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 228
LASAR: Towards Spatio-temporal Reasoning with Latent Cognitive Map
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 229
Text-Phase Synergy Network with Dual Priors for Unsupervised Cross-Domain Image Retrieval
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 230
EagleNet: Energy-Aware Fine-Grained Relationship Learning Network for Text-Video Retrieval
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 231
PIX-TAB: Efficient PIXel-Precise TABle Structure Recognition Approach with Speculative Decoding and Region-Based Image Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 232
CARLoS: Retrieval via Concise Assessment Representation of LoRAs at Scale
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 233
Camouflage-aware Image-Text Retrieval via Expert Collaboration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 234
TriSim: Tri-Dimensional Similarity Modeling with Extreme Value Theory for False-Negative Mitigation in Remote Sensing Image-Text Retrieval
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 235
TIGER: A Unified Framework for Time, Images and Geo-location Retrieval
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 236
Mistake Attribution: Fine-Grained Mistake Understanding in Egocentric Videos
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 237
VidTAG: Temporally Aligned Video to GPS Geolocalization with Denoising Sequence Prediction at a Global Scale
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 238
Stitch-a-Demo: Creating Video Demonstrations from Multistep Descriptions
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 239
Prototypical Action Reasoning Facilitated by Vision-Language Alignment for Egocentric Action Anticipation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 240
AdaSpot: Spend Resolution Where It Matters for Precise Event Spotting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 241
Unique Lives, Shared World: Learning from Single-Life Videos
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 242
Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 243
VideoARM: Agentic Reasoning over Hierarchical Memory for Long-Form Video Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 244
Wavelet-based Frame Selection by Detecting Semantic Boundary for Long Video Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 245
SVAgent: Storyline-guided Long Video Understanding via Cross-Modal Multi-Agent Collaboration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 246
Frame2Freq: Spectral Adapters for Fine-Grained Video Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 247
Structural Graph Probing of Vision–Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 248
Saliency-R1: Enforcing Interpretable and Faithful Vision-language Reasoning via Saliency-map Alignment Reward
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 250
MaskDiME: Adaptive Masked Diffusion for Precise and Efficient Visual Counterfactual Explanations
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 251
TRANSPORTER: Transferring Visual Semantics from VLM Manifolds
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 252
Relational Visual Similarity
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 253
PointCNN++: Performant Convolution on Native Points
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 254
Fast Markov Random Field Optimisation for Topologically Noisy 3D Shape Matching
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 255
LitePT: Lighter Yet Stronger Point Transformer
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 256
SuP: Sub-cloud Driven Point Cloud Registration
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 257
PQDT: Pseudo-Query Dual Transformer for Robust Point Cloud Restoration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 258
Test-Time Training for LiDAR Semantic Segmentation under Corruption via Geometric Inlier Discrimination
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 259
MHopReg: Efficient Hierarchical Multi-Hop Graph Search for Point Cloud Registration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 260
GEM: Generating LiDAR World Model via Deformable Mamba
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 261
Hybrid Robust Collaborative Perception with LiDAR-4D Radar Fusion under Adverse Weather Conditions
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 262
Task-Driven Implicit Representations for Automated Design of LiDAR Systems
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 263
Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 264
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 265
Beyond Layer-Wise Merging: Chain-of-Merging for Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 266
GazeShift: Unsupervised Gaze Estimation and Dataset for VR
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 267
Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 268
Reevaluating the Intra-Modal Misalignment Hypothesis in CLIP
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 269
Dr. Seg: Revisiting GRPO Training for Visual Large Language Models through Perception-Oriented Design
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 270
Soft Modality-Guided Expert Specialization in MoE-VLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 271
CoVFT: Context-aware Visual Fine-tuning for Multimodal Large Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 273
AutoRegressive Generation with B-rep Holistic Token Sequence Representation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 274
VecGlypher: Unified Vector Glyph Generation with Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 275
NERFIFY: A Multi-Agent Framework for Turning NeRF Papers into Code
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 276
Diagram2Structure: Unlocking LLMs' Diagram Comprehension through DiagramDiff, an Offline Diagram Structuring Framework
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 277
ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 278
GardenDesigner: Encoding Aesthetic Principles into Jiangnan Garden Construction via a Chain of Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 279
ShadowDraw: From Any Object to Shadow-Drawing Compositional Art
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 280
End-to-End Hyper-Relational Information Extraction for Engineering Diagrams via Dynamically Tokenized Relation Transformer
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 281
When Anonymity Breaks: Identifying Models Behind Text-to-Image Leaderboards
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 282
Bias at the End of the Score
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 283
PECCVAI: Overcoming the Brittleness of AI Image Watermarking Under Visual Paraphrasing Attacks
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 284
Dynamic Token Reweighting for Robust Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 285
COPYLENS: Towards Copyrighted Characters Infringement Detection via Copyright-Aware Prompt Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 286
Closed-Form Concept Erasure via Double Projections
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 287
Adaptive Bayesian Early-Exit Networks for Efficient Non-Transferable Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 288
Stake the Points: Structure-Faithful Instance Unlearning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 289
Federated Active Learning Under Extreme Non-IID and Global Class Imbalance
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 290
FedRG: Unleashing the Representation Geometry for Federated Learning with Noisy Clients
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 291
FedCART: Tackling Long-Tailed Distributions in Federated Adversarial Training via Classifier Refinement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 292
Generalized and Personalized Federated Learning with Black-Box Foundation Models via Orthogonal Transformations
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 293
Fully Decentralized Certified Unlearning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 294
Fed-ADE: Adaptive Learning Rate for Federated Post-adaptation under Distribution Shift
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 295
Towards Streaming Referring Video Segmentation via Large Language Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 296
Multi-speaker Attention Alignment for Multimodal Social Interaction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 297
OmniVTG: A Large-Scale Dataset and Training Paradigm for Open-World Video Temporal Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 298
SARL-STG: A Spatially Aware Reinforcement Learning Framework for Refining MLLMs in Spatio-Temporal Video Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 299
VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 301
UniCompress: Token Compression for Unified Vision–Language Understanding and Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 302
StreamingTOM: Streaming Token Compression for Efficient Video Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 303
SCoRe: Salience-Coverage Reduction for Vision Token Pruning in Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 304
VLM-PTQ: Efficient Post-Training Quantization for Large Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 305
Aligning What Vision-Language Models See and Perceive with Adaptive Information Flow
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 307
Rethinking Token Reduction for Large Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 308
Prototype-based Causal Intervention for Multi-Label Image Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 309
FAST: Topology-Aware Frequency-Domain Distribution Matching for Coreset Selection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 310
Face-Guided Sentiment Boundary Enhancement for Weakly-Supervised Temporal Sentiment Localization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 311
Evidential Deep Partial Label Learning to Quantify Disambiguation Uncertainty
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 312
Unlocking Strong Supervision: A Data-Centric Study of General-Purpose Audio Pre-Training Methods
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 313
Revisiting Learning with Noisy Labels: Active Forgetting and Noise Suppression
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 314
PAF: Perturbation-Aware Filtering for Open-Set Semi-Supervised Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 315
Global-Graph Guided and Local-Graph Weighted Contrastive Learning for Unified Clustering on Incomplete and Noise Multi-View Data
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 316
Enhancing Out-of-Distribution Detection with Extended Logit Normalization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 318
Unposed-to-3D: Learning Simulation-Ready Vehicles from Real-World Images
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 319
SafeDrive: Fine-Grained Safety Reasoning for End-to-End Driving in a Sparse World
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 320
RAG-TP: A General Framework for Vehicle Trajectory Prediction via Retrieval-Augmented Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 321
Perceiving the Near, Reasoning the Distant: Coherent Long-Horizon Trajectory Prediction for Autonomous Driving
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 322
Dual-Agent Reinforcement Learning for Adaptive and Cost-Aware Visual–Inertial Odometry
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 323
HorizonForge: Driving Scene Editing with Any Trajectories and Any Vehicles
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 324
AMap: Distilling Future Priors for Ahead-Aware Online HD Map Construction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 325
WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 326
PlannerRFT: Reinforcing Diffusion Planners through Closed-Loop and Sample-Efficient Fine-Tuning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 327
MARIS: Marine Open-Vocabulary Instance Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 328
XSeg: A Large-scale X-ray Contraband Segmentation Benchmark For Real-World Security Screening
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 329
Training-Free Open-Vocabulary Camouflaged Object Segmentation via Fine-Grained Object Binding and Adaptive Hybrid Prompt
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 330
M⁴-SAM: Multi-Modal Mixture-of-Experts with Memory-Augmented SAM for RGB-D Video Salient Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 331
ReAttnCLIP: Training-Free Open-Vocabulary Remote Sensing Image Segmentation via Re-defined Attention in CLIP
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 332
Mixture of Prototypes for Test-time Adaptive Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 333
Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 334
ELVIS: Enhance Low-Light for Video Instance Segmentation in the Dark
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 335
Decouple Your Discovery and Memory in Continual Generalized Category Discovery
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 336
Beyond the Static World: Continual Category Discovery under Visual Drift
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 337
Memory-Efficient Transfer Learning with Fading Side Networks via Masked Dual Path Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 338
SAME: Sparse and Anchored Model Editing for Heterogeneous Incremental Learning under Limited Data
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 339
CHEEM: Continual Learning by Reuse, New, Adapt and Skip - A Hierarchical Exploration-Exploitation Approach
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 340
Exemplar-Free Continual Learning for State Space Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 341
A Faster Path to Continual Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 342
Continual Learning for fMRI-Based Brain Disorder Diagnosis via Functional Connectivity Matrices Generative Replay
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 343
BeautyGRPO: Aesthetic Alignment for Face Retouching via Dynamic Path Guidance and Fine-Grained Preference Modeling
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 344
SyncDreamer: Controllable and Expressive Avatar Generation Beyond the Talking Head
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 345
PerformRecast: Expression and Head Pose Disentanglement for Portrait Video Editing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 346
UniLS: End-to-End Audio-Driven Avatars for Unified Listening and Speaking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 347
PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 348
FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 349
DriveVLN: Towards Mapless Vision-and-Language Navigation in Autonomous Driving
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 350
Towards Open Environments and Instructions: General Vision-Language Navigation via Fast-Slow Interactive Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 351
Unifying Language-Action Understanding and Generation for Autonomous Driving
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 352
Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 353
Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 354
CGHair: Compact Gaussian Hair Reconstruction with Card Clustering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 355
HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 356
Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 357
RelightAnyone: A Generalized Relightable 3D Gaussian Head Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 358
Feed-forward Gaussian Registration for Head Avatar Creation and Editing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 359
Residual Decoding: Mitigating Hallucinations in Large Vision-Language Models via History-Aware Residual Guidance
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 360
Prefill-Time Intervention for Mitigating Hallucination in Large Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 361
SVHalluc: Benchmarking Speech–Vision Hallucination in Audio-Visual Large Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 362
Same Attention, Different Truths: Put Logit-Lens over Visual Attention to Detect and Mitigate LVLM Object Hallucination
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 363
Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 364
Lyapunov Probes for Hallucination Detection in Large Foundation Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 365
Captain Safari: A World Engine with Pose-Aligned 3D Memory
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 366
Gen3R: 3D Scene Generation Meets Feed-Forward Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 367
PerpetualWonder: Long-horizon Action-conditioned 4D Scene Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 368
CineScene: Implicit 3D as Effective Scene Representation for Cinematic Video Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 369
DreamStereo: Towards Real-Time Stereo Inpainting for HD Videos
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 371
RecEdit-Drive: 3D Reconstruction-Guided Spatiotemporal Video Editing for Autonomous Driving Scenes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 372
RAYNOVA: Scale-Temporal Autoregressive World Modeling in Ray Space
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 373
RigMo: Unifying Rig and Motion Learning for Generative Animation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 374
LaVR: Scene Latent Conditioned Generative Video Trajectory Re-Rendering using Large 4D Reconstruction Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 376
Detect Anything via Next Point Prediction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 378
Distribution-Aligned Multimodal Fusion for Robust Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 379
PaQ-DETR: Learning Pattern and Quality-Aware Dynamic Queries for Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 381
Efficiency Follows Global-Local Decoupling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 382
VRCLIP: Multimodal Canonical Correlation Alignment for CLIP-Driven Vision-Radio Person Re-Identification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 383
EReCu: Pseudo-label Evolution Fusion and Refinement with Multi-Cue Learning for Unsupervised Camouflage Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 384
Expert-Teacher-Student Collaborative Learning for Domain Adaptive Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 385
CI-VID: A Coherent Interleaved Text-Video Dataset
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 386
Generalizable Video Quality Assessment via Weak-to-Strong Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 387
EgoSound: Benchmarking Sound Understanding in Egocentric Videos
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 389
GIFT: Global Irreplaceability Frame Targeting for Efficient Video Understanding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 390
Select Less, Reason More: Prioritizing Evidence Purity for Video Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 391
Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 392
Compositional Transformation Reasoning for Composed Video Retrieval
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 393
UniVBench: Towards Unified Evaluation for Video Foundation Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 394
NAMI: Efficient Image Generation via Bridged Progressive Rectified Flow Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 395
InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 396
TimeRipples: Accelerating vDiTs by Understanding the Spatio-Temporal Correlations in Latent Space
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 397
ProcessMaker: A Generalized Process Visualization Framework with Adaptive Sequence Steps on Diffusion Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 398
MeanFlow Transformers with Representation Autoencoders
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 399
DiT-IC: Aligned Diffusion Transformer for Efficient Image Compression
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 400
FARMER: Flow AutoRegressive Transformer over Pixels
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 401
Probabilistic Precipitation Nowcasting with Rectified Flow Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 402
FlowDC: Flow-Based Decoupling-Decay for Complex Image Editing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 403
High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 404
3D-Object Perception Transformer (3PT)
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 405
SemLT3D: Semantic-Guided Expert Distillation for Camera-only Long-Tailed 3D Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 406
Spe-BEVHead: Rethinking the Detection Head Design for Bird’s-Eye-View Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 407
Unsupervised Multi-agent and Single-agent Perception from Cooperative Views
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 408
Zoo3D: Zero-Shot 3D Object Detection at Scene Level
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 409
Beyond Appearance: Camouflaged Object Detection via Geometric Structure
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 410
SABER: Spatially Consistent 3D Universal Adversarial Objects for BEV Detectors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 411
AceTone: Bridging Words and Colors for Conditional Image Grading
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 412
Do VLMs Perceive or Recall? Probing Visual Perception vs. Memory with Classic Visual Illusions
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 413
Pixels Don't Lie (But Your Detector Might): Bootstrapping MLLM-as-a-Judge for Trustworthy Deepfake Detection and Reasoning Supervision
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 414
UI-Lens: Assessing General MLLMs’ Potential to Automate UI Display Quality Assurance
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 415
Seeing is Improving: Visual Feedback for Iterative Text Layout Refinement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 416
Is your VLM Sky-Ready? A Comprehensive Spatial Intelligence Benchmark for UAV Navigation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 417
Linking Perception, Confidence and Accuracy in MLLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 418
AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 419
Learning to Focus and Precise Cropping: A Reinforcement Learning Framework with Information Gaps and Grounding Loss for MLLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 420
From Pixel to Precision: Enhancing Handwritten Mathematical Expression Recognition with Image-Level Reward
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 421
Rethinking Pose Refinement in 3D Gaussian Splatting under Pose Prior and Geometric Uncertainty
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 422
Revisiting Pose Sensitivity in Splat-based Computed Tomography under Sparse-view Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 423
Seele: A Unified Acceleration Framework for Real-Time Gaussian Splatting on Mobile Devices
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 424
GHPT: Real-Time Relightable Gaussian Splatting using Hybrid Path Tracing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 425
PolarGuide-GSDR: 3D Gaussian Splatting Driven by Polarization Priors and Deferred Reflection for Real-World Reflective Scenes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 426
EcoSplat: Efficiency-controllable Feed-forward 3D Gaussian Splatting from Multi-view Images
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 427
SGS-Intrinsic: Semantic-Invariant Gaussian Splatting for Sparse-View Indoor Inverse Rendering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 428
GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 429
3D Gaussian Splatting with Self-Constrained Priors for High Fidelity Surface Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 430
FilterGS: Traversal-Free Parallel Filtering and Adaptive Shrinking for Large-Scale LoD 3D Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 431
TWINGS: Thin Plate Splines Warp-aligned Initialization for Sparse-View Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 432
VarSplat: Uncertainty-aware 3D Gaussian Splatting for Robust RGB-D SLAM
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 434
FastGS: Training 3D Gaussian Splatting in 100 Seconds
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 435
BrepGaussian: CAD reconstruction from Multi-View Images with Gaussian Splatting
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 436
ODGS-SLAM: Omnidirectional Gaussian Splatting SLAM
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 437
BA-GS: Bayesian Adaptive Gaussian Splatting for SFM-Free 3D Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 438
FSFSplatter: Geometrically Accurate Reconstruction with Free Sparse-view Images within 2 minutes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 439
ViRC: Enhancing Visual Interleaved Mathematical CoT with Reason Chunking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 440
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 441
PixDLM: A Dual-Path Multimodal Language Model for UAV Reasoning Segmentation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 442
Can a Second-View Image Be a Language? Geometric and Semantic Cross-Modal Reasoning for X-ray Prohibited Item Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 443
VCU-Bridge: Hierarchical Visual Connotation Understanding via Semantic Bridging
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 444
Learning to See through Illumination Extremes with Event Streaming in Multimodal Large Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 445
VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 446
Cut to the Chase: Training-free Multimodal Summarization via Chain-of-Events
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 447
UVU: Improving Multimodal Understanding via Vision-Language Unified Autoregressive Paradigm
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 448
PointThinker: Point-Incentivized Parallel Thinking for Multimodal Large Language Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 449
OctoMed: Data Recipes for State-of-the-Art Multimodal Medical Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 450
HoneyBee: Data Recipes for Vision-Language Reasoners
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 451
VisPlay: Self-Evolving Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 452
Chart-FR1: Visual Focus-Driven Fine-Grained Reasoning on Dense Charts
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 453
Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 454
ApET: Approximation-Error Guided Token Compression for Efficient VLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 455
Granulon: Awakening Pixel-Level Visual Encoders with Adaptive Multi-Granularity Semantics for MLLM
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 456
Vision Transformers Need More Than Registers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 457
Head-wise Adaptive Rotary Positional Encoding for Fine-Grained Image Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 458
PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 459
AdaSVD: Singular Value Decomposition with Adaptive Mechanisms for Large Multimodal Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 460
ReFTA: Breaking the Weight Reconstruction Bottleneck in Tensorized Parameter-Efficient Fine-Tuning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 461
HTTM: Head-wise Temporal Token Merging for Faster VGGT
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 463
Self-Attention Driven Tensor Representation for High-Order Data Recovery
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 464
PlanaReLoc: Camera Relocalization in 3D Planar Primitives via Region-Based Structure Matching
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 465
MOGeo: Beyond One-to-One Cross-View Object Geo-localization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 466
Homaloidal parametrization for detecting critical two-view configurations
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 467
AsymLoc: Towards Asymmetric Feature Matching for Efficient Visual Localization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 469
Asking like Socrates: Socrates helps VLMs understand remote sensing images
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 470
GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 471
Let VLMs Grade Their Own Thoughts: A Self-Quantification Approach to Reasoning-Aware Reward Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 472
SciEducator: Scientific Video Understanding and Educating via Deming-Cycle Multi-Agent System
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 473
SenseSearch: Empowering Vision-Language Models with High-Resolution Agentic Search-Reasoning via Reinforcement Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 474
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 475
VideoSSR: Video Self-Supervised Reinforcement Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 476
Neurodynamics-Driven Coupled Neural P Systems for Multi-Focus Image Fusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 477
MagicFuse: Single Image Fusion for Visual and Semantic Reinforcement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 478
Bridging Pixels and Words: Mask-Aware Local Semantic Fusion for Multimodal Media Verification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 479
Human-Centric Multi-Exposure Fusion: Benchmark and Bi-level Cognition Distillation Framework
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 480
ConceptPose: Training-Free Zero-Shot Object Pose Estimation using Concept Vectors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 481
A Closer Look at Cross-Domain Few-Shot Object Detection: Fine-Tuning Matters and Parallel Decoder Helps
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 482
NAF: Zero-Shot Feature Upsampling via Neighborhood Attention Filtering
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 483
Universal-to-Specific: Dynamic Knowledge-Guided Multiple Instance Learning for Few-Shot Whole Slide Image Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 484
SOTA: Self-adaptive Optimal Transport for Zero-Shot Classification with Multiple Foundation Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 485
Uni-DAD: Unified Distillation and Adaptation of Diffusion Models for Few-step Few-shot Image Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 488
IMS3: Breaking Distributional Aggregation in Diffusion-Based Dataset Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 489
Continuous Exposure-Time Modeling for Realistic Atmospheric Turbulence Synthesis
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 490
240FPS Stereo Vision from Monocular Mixed Spikes
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 491
D^2-FOSA: Dual-Diffusion Guided EEG-to-Image Reconstruction with Frequency-Oriented Semantic Alignment
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 492
Self-Diffusion Driven Blind Imaging
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 493
Differentiable Stroke Planning with Dual Parameterization for Efficient and High-Fidelity Painting Creation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 494
Solvability of the Viewing Graph Under the Affine Camera Model
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 495
DiffBMP: Differentiable Rendering with Bitmap Primitives
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 496
Splat-Based Metal Artifact Reduction in Cone-Beam CT via Compact Attenuation Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 497
Lumosaic: Hyperspectral Video via Active Illumination and Coded-Exposure Pixels
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 498
Towards Universal Computational Aberration Correction in Photographic Cameras: A Comprehensive Benchmark Analysis
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 499
Multi-View Hierarchical Alignment Learning for Spatial Transcriptomics
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 500
FEAST: Fully Connected Expressive Attention for Spatial Transcriptomics
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 502
OrienPose: Orientation-Guided Novel View Synthesis for Single-Image Unseen Object Pose Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 503
Illustrator’s Depth: Monocular Layer Index Prediction for Image Decomposition
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 504
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 505
Seeing Depth Through Frequency and Motion: A Progressive Training Paradigm for Monocular Depth Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 506
GeoGuide: Hierarchical Geometric Guidance for Open-Vocabulary 3D Semantic Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 508
PE3R: Perception-Efficient 3D Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 509
GS-ASM: 2DGS-Supervised Active Stereo Matching
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 510
Real2Sim2Real: RetinalDepth-64K for Depth Estimation in Posterior Segment Ophthalmic Surgery
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 511
Iris: Bringing Real-World Priors into Diffusion Model for Monocular Depth Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 512
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 513
AirSim360: A Panoramic Simulation Platform within Drone View
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 514
Radar-Guided Polynomial Fitting for Metric Depth Estimation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 515
UniDAC: Universal Metric Depth Estimation for Any Camera
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 517
I-Scene: 3D Instance Models are Implicit Generalizable Spatial Learners
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 518
REVIVE 3D: Refinement via Encoded Voluminous Inflated prior for Volume Enhancement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 519
Muses: Designing, Composing, Generating Nonexistent Fantasy 3D Creatures without Training
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 520
EI-Part: Explode for Completion and Implode for Refinement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 521
MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 522
Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 523
ViLearn: Accelerating Training Convergence of Image-to-3D Generation via Visibility Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 524
FlashMesh: Faster and Better Autoregressive Mesh Synthesis via Structured Speculation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 525
X-Part: High Fidelity And Structure Coherent Shape Decomposition And Completion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 526
Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 527
TopoMesh: High-Fidelity Mesh Autoencoding via Topological Unification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 528
Nestwork: Conditional 3D Furnished House Layout Generation through Latent Heterogeneous Graph Diffusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 529
TEXTRIX: Latent Attribute Grid for Native Texture Generation and Beyond
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 530
Beyond Geometry: Artistic Disparity Synthesis for Immersive 2D-to-3D
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 531
WorldGen: From Text to Traversable and Interactive 3D Worlds
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 532
ExMesh: EXplicit Mesh Reconstruction with Topology Adaptation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 533
SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 534
ShapeR: Robust Conditional 3D Shape Generation from Casual Captures
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 536
3DrawAgent: Teaching LLM to Draw in 3D with Early Contrastive Experience
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 537
Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 538
HiFi-BRep: High-Fidelity Latent Representation for Robust B-Rep Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 539
PhysGen: Physically Grounded 3D Shape Generation for Industrial Design
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 540
Perceptual 3D Simulation With Physical World Modeling
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 541
EchoFoley: Event-Centric Hierarchical Control for Video Grounded Creative Sound Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 542
Active Intelligence in Video Avatars via Closed-loop World Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 543
Enhancing Spatial Understanding in Image Generation via Reward Modeling
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 544
Seeing What Matters: Visual Preference Policy Optimization for Visual Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 545
TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 546
Identity-Preserving Image-to-Video Generation via Reward-Guided Optimization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 547
JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 548
Learning Latent Proxies for Controllable Single-Image Relighting
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 549
MoVie: Broaden Your Views with Human Motion for Action Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 550
MooCap: A Multi-View Benchmark for Cow-Object-Human Interaction and Behavior Dynamics
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 551
LAOF: Robust Latent Action Learning with Optical Flow Constraints
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 552
DarkAct: A RGB-Thermal Dataset and Fusion Framework for Multimodal Low-Light Action Recognition
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 554
Steering Where to Diffuse: Generative Modeling of Phenotypic Response Simulation with Steered Diffusion Bridge
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 557
RNN as Linear Transformer: A Closer Investigation into Representational Potentials of Visual Mamba Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 558
Coupling Liquid Time-Constant Encoders with Modern Hopfield Memory
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 559
Stronger Normalization-Free Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 560
HCL-FF: Hierarchical and Contrastive Learning for Forward-Forward Algorithm
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 561
Can You Learn to See Without Images? Procedural Warm-Up for Vision Transformers
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 562
Convolutional Neural Networks Driven by Content Similarity
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 563
MorphSeek: Fine-grained Latent Representation-Level Policy Optimization for Deformable Image Registration
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 564
HATS: Hardness-Aware Trajectory Synthesis for GUI Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 565
MVP: Multiple View Prediction Improves GUI Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 566
Towards GUI Agents: Vision-Language Diffusion Models for GUI Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 567
ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence On Mobile Devices
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 568
OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 569
Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 571
Beyond Weak Supervision: MLLMs-Guided Graded Knowledge Distillation for Unsupervised Camouflaged Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 572
Detecting Unknown Objects via Energy-based Separation for Open World Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 573
Beyond Prompt Degradation: Prototype-guided Dual-pool Prompting for Incremental Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 574
SPAR: Single-Pass Any-Resolution ViT for Open-vocabulary Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 575
TTL: Test-time Textual Learning for OOD Detection with Pretrained Vision-Language Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 576
Parameterized Prompt for Incremental Object Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 577
SRA-Det: Learning Omni-Grained Open-Vocabulary Detection Beyond Category Names
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 578
Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation?
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 579
PCA-Seg: Revisiting Cost Aggregation for Open-Vocabulary Semantic and Part Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 580
Partial Weakly-Supervised Oriented Object Detection
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 581
Seeing Both Sides: Towards Bidirectional Semantic Alignment for Open-Vocabulary Camouflaged Object Segmentation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 582
Towards Robust Multi-Modal Semantic Segmentation with Teacher-Student Framework and Hybrid Prototype Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 583
REL-SF4PASS: Panoramic Semantic Segmentation with REL Depth Representation and Spherical Fusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 584
Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 585
From Softmax to Dirichlet: Evidential Learning for Semi-supervised Semantic Segmentation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 586
Particulate: Feed-Forward 3D Object Articulation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 587
HOPS: Hierarchical Open-vocabulary Part Segmentation with Attention-Aware Filtering and Affinity-Guided Enhancement
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 588
Shape-of-You: Fused Gromov-Wasserstein Optimal Transport for Semantic Correspondence in-the-Wild
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 589
MEMO: Human-like Crisp Edge Detection Using Masked Edge Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 590
MUFASA: A Multi-Layer Framework for Slot Attention
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 591
ChangeBridge: Spatiotemporal Image Generation with Multimodal Controls for Remote Sensing
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 592
MOMO: Mars Orbital MOdel Foundation Model for Mars Orbital Applications
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 593
Seeing Through the Noise: Improving Infrared Small Target Detection and Segmentation from Noise Suppression Perspective
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 594
GeoBridge: A Semantic-Anchored Multi-View Foundation Model Bridging Images and Text for Geo-Localization
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 595
GeoSANE: Learning Geospatial Representations from Models, Not Data
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 596
Brewing Stronger Features: Dual-Teacher Distillation for Multispectral Earth Observation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 597
Spectral Super-Resolution via Adversarial Unfolding and Data-Driven Spectrum Regularization: From Multispectral Satellite Data to NASA Hyperspectral Image
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 598
RAMEN: Resolution-Adjustable Multimodal Encoder for Earth Observation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 599
ORSATR-X: A Foundation Model based on Differential-and-Excitation Networks for Optical Remote Sensing Object Recognition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 600
SEBA: Sample-Efficient Black-Box Attacks on Visual Reinforcement Learning
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 601
IAG: Input-aware Backdoor Attack on VLM-based Visual Grounding
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 602
DASH: A Meta-Attack Framework for Synthesizing Effective and Stealthy Adversarial Examples
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 603
AdapAction: Adaptive Target Action Backdoor Attack against GUI Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 604
Phantom: Physical Object Interactions as Dynamic Triggers for NMS-Exploited Backdoors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 605
Verifying Neural Network Robustness with Dual Perturbations
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 606
Defending Unauthorized Model Merging via Dual-Stage Weight Protection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 608
On the Role of Temporal Granularity in the Robustness of Spiking Neural Networks
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 609
Boosting Vision-Language-Action Finetuning with Feasible Action Neighborhood Prior
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 610
Exploring Conditions for Diffusion Models in Robotic Control
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 611
A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 612
Efficient Hybrid SE(3)-Equivariant Visuomotor Flow Policy via Spherical Harmonics for Robot Manipulation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 614
Scaling Spatial and Temporal Context for Robotic Imitation Learning Policies With Scene Graphs
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 615
AdaDexTrack: Dynamic Modulation for Adaptive and Generalizable Dexterous Manipulation Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 616
GraspLDP: Towards Generalizable Grasping Policy via Latent Diffusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 617
MoEActok: A MoE-based Action Tokenizer for Vision-Language-Action Models
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 618
A Cross-view Fusion Framework for Robust 6-DoF Grasp Pose Estimation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 619
SAVA-X: Ego-to-Exo Imitation Error Detection via Scene-Adaptive View Alignment and Bidirectional Cross View Fusion
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 620
PromptDepth: Efficient and Promptable Geometric 3D Vision Model for Embodied Intelligence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 621
Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3-D Constrained Terrains
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 622
PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 623
IGen: Scalable Data Generation for Robot Learning from Open-World Images
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 624
Hypergraph-State Collaborative Reasoning for Multi-Object Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 625
TGTrack: Temporal Generative Learning for Unified Single Object Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 626
GeoMotion: Rethinking Motion Segmentation via Latent 4D Geometry
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 627
Generalizable Structure-Aware Keypoint Correspondence for Category-Unified 3D Single Object Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 628
Generative Point Tracking and Forecasting
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 629
RAGTrack: Language-aware RGBT Tracking with Retrieval-Augmented Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 630
Dual-level Adaptation for Multi-Object Tracking: Building Test-Time Calibration from Experience and Intuition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 631
GMT: Effective Global Framework for Multi-Target Multi-Camera Tracking
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 632
Bridging Brain and Semantics: A Hierarchical Framework for Semantically Enhanced fMRI-to-Video Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 633
GraPHFormer: A Multimodal Graph Persistent Homology Transformer for the Analysis of Neuroscience Morphologies
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 634
DARC: Dual Adjustment Reasoning with Counterfactuals for Trustworthy Chest X-ray Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 635
Every Error has Its Magnitude: Asymmetric Mistake Severity Training for Multiclass Multiple Instance Learning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 636
Phrase-grounded APO for Improving Chest X-ray Report Generation
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 637
Focus-to-Perceive Representation Learning: A Cognition-Inspired Hierarchical Framework for Endoscopic Video Analysis
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 638
OraPO: Oracle-educated Reinforcement Learning for Data-efficient and Factual Radiology Report Generation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 639
FluoCLIP: Stain-Aware Focus Quality Assessment in Fluorescence Microscopy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 640
CryoKRAQEN: Kernel-Regularized Annealing for Quantized Embedding Networks in Cryo-EM Heterogeneous Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 641
Building Robust Vision Encoders for Cross-Dataset Evaluation in Immunofluorescent Microscopy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 642
H2-Surv: Hierarchical Hyperbolic Multimodal Representation Learning for Survival Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 643
Dual-Level Hypergraph Generation for Addressing Feature Scarcity in Whole-Slide Image Classification
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 644
Temporal Inversion for Learning Interval Change in Chest X-Rays
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 645
JUMP-Hand: Learning Joint-wise Uncertainty to Gate Mixture of View Experts for Multi-View 3D Hand Reconstruction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 646
PAD-Hand: Physics-Aware Diffusion for Hand Motion Recovery
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 647
Anatomical Domain Shifts: Test-time Heterogeneous Adaptation for 3D Human Pose Prediction
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 648
Unlocking Motion from Large Vision Models with a Semantic and Kinematic Duality for Gait Recognition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 649
Learning 3D Shape Fidelity Metric from Real-world Distortions
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 650
BarbieGait: An Identity-Consistent Synthetic Human Dataset with Versatile Cloth-Changing for Gait Recognition
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 651
FisherPoser: Human Motion Estimation from Sparse Observations with Hierarchical Region-Wise Fisher-Matrix Uncertainty Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 652
EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 653
Ground Reaction Inertial Poser: Physics-based Human Motion Capture from Sparse IMUs and Insole Pressure Sensors
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 654
FUN REC Reconstructing Functional 3D Scenes from Egocentric Interaction Videos
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 655
VIMCAN: Visual-Inertial 3D Human Pose Estimation with Hybrid Mamba-Cross-Attention Network
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 656
Bringing Your Portrait to 3D Presence
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 657
FLOW: Feature-Level Optimal Warping for Generalized Remote Physiological Measurement
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 659
UniMMAD: Unified Multi-Modal and Multi-Class Anomaly Detection via MoE-Driven Feature Decompression
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 660
BUSSARD: Normalizing Flows for Bijective Universal Scene-Specific Anomalous Relationship Detection
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 661
Multi-Prototype Compactness and Boundary-Aware Synthesis for Unsupervised Anomaly Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 662
PDD: Manifold-Prior Diverse Distillation for Medical Anomaly Detection
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 663
Weakly Supervised Video Anomaly Detection with Anomaly-Connected Components and Intention Reasoning
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 664
SubspaceAD: Training-Free Few-Shot Anomaly Detection via Subspace Modeling
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 665
Learning Spatial-Temporal Consistency for 3D Semantic Scene Completion
[
Slides]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 666
Generalizing Visual Geometry Priors to Sparse Gaussian Occupancy Prediction
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 667
Deformable Gaussian Occupancy: Decoupling Rigid and Nonrigid Motion with Factorized Distillation
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 668
OccAny: Generalized Unconstrained Urban 3D Occupancy
[
Poster]
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 669
Dr.Occ: Depth- and Region-Guided 3D Occupancy from Surround-View Cameras for Autonomous Driving
Poster
Sat Jun 06 03:45 PM -- 05:45 PM (PDT) @ ExHall A & F 670
ShelfOcc: Native 3D Supervision beyond LiDAR for Vision-Based Occupancy Estimation
[
Poster]