Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 1
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 2
Adversarial Style Optimization: Enhancing VLM Jailbreaks by GRPO-based Stylistic Triggers Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 3
ANTS: Adaptive Negative Textual Space Shaping for OOD Detection via Test-Time MLLM Understanding and Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 4
ARGUS: Defending Against Multimodal Indirect Prompt Injection via Steering Instruction-Following Behavior
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 5
TEAR: Temporal-aware Automated Red-teaming for Text-to-Video Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 6
ViT^3: Unlocking Test-Time Training in Vision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 7
Black-box Membership Inference Attacks on the Pre-training Data of Image-generation Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 8
Data Leakage Detection and De-duplication in Large Scale Geospatial Image Datasets
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 9
RAVEN: Erasing Invisible Watermarks via Novel View Synthesis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 10
LDP-Slicing: Local Differential Privacy for Images via Randomized Bit-Plane Slicing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 11
NOWA: Null-space Optical Watermark for Invisible Capture Fingerprinting and Tamper Localization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 12
Revisiting Geometric Obfuscation with Dual Convergent Lines for Privacy-Preserving Image Queries in Visual Localization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 14
Does YOLO Really Need to See Every Training Image in Every Epoch?
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 15
Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 16
NuWa: Deriving Lightweight Class-Specific Vision Transformers for Edge Devices
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 17
Plant Taxonomy Meets Plant Counting: A Fine-Grained, Taxonomic Dataset for Counting Hundreds of Plant Species
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 18
Rethinking Dataset Distillation: Hard Truths about Soft Labels
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 19
Customized Fusion: A Closed-Loop Dynamic Network for Adaptive Multi-Task-Aware Infrared-Visible Image Fusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 20
Dual Band Thermal Videography: Separating Time-Varying Reflection and Emission Near Ambient Conditions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 21
MetaSpectra+: A Compact Broadband Metasurface Camera for Snapshot Hyperspectral+ Imaging
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 22
Spectrum from Defocus: Fast Spectral Imaging with Chromatic Focal Stack
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 23
Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 24
UnReflectAnything: RGB-Only Highlight Removal by Rendering Synthetic Specular Supervision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 25
AVGGT: Rethinking Global Attention for Accelerating VGGT
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 26
ManifoldNeuS: Manifold-aware View Optimizability for Pose-Free Neural Surface Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 27
LongStream: Long-Sequence Streaming Autoregressive Visual Geometry
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 28
RPGFusion: 4D Radar Prior-Guided Multi-Modal Fusion for 3D Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 29
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 30
JRM: Joint Reconstruction Model for Multiple Objects without Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 32
FreeScale: Scaling 3D Scenes via Certainty-Aware Free-View Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 33
Complet4R: Geometric Complete 4D Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 34
Unblur-SLAM: Dense Neural SLAM for Blurry Inputs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 35
Learning Compact 3D Representations from Feed-Forward Novel View Synthesis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 36
Fast Spatial Tracking with Visual Geometry Transformer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 37
How Much 3D Do Video Foundation Models Encode?
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 38
MetroGS: Efficient and Stable Reconstruction of Geometrically Accurate High-Fidelity Large-Scale Scenes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 39
RnG: A Unified Transformer for Complete 3D Modeling from Partial Observations
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 40
Long-Tail Internet Photo Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 41
Emergent Outlier View Rejection in Visual Geometry Grounded Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 42
Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 43
MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 44
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 45
Design Your Ad: Personalized Advertising Image and Text Generation with Unified Autoregressive Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 47
ConsistCompose: Unified Multimodal Layout Control for Image Composition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 48
A Training-Free Style-Personalization via SVD-Based Feature Decomposition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 49
Beyond Patches: Global-aware Autoregressive Model for Multimodal Few-Shot Font Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 50
ImageRAGTurbo: Towards One-step Text-to-Image Generation with Retrieval-Augmented Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 51
OmniSonic: Towards Universal and Holistic Audio Generation from Video and Text
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 53
Curriculum Group Policy Optimization: Adaptive Sampling for Unleashing the Potential of Text-to-Image Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 54
SplitFlux: Learning to Decouple Content and Style from a Single Image
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 55
FontCrafter: High-Fidelity Element-Driven Artistic Font Creation with Visual In-Context Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 56
EmoStyle: Emotion-Driven Image Stylization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 57
Text-Image Conditioned 3D Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 58
IntroSVG: Learning from Rendering Feedback for Text-to-SVG Generation via an Introspective Generator–Critic Framework
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 59
AnyDoc: Enhancing Document Generation via Large-Scale HTML/CSS Data Synthesis and Height-Aware Reinforcement Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 60
Reasoning Diffusion for Unpaired Test Time Out-of-distribution Text-Image to Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 61
SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 62
STAGE: Storyboard-Anchored Generation for Cinematic Multi-shot Narrative
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 63
MTA: Multimodal Task Alignment for BEV Perception and Captioning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 64
β-CLIP: Text-Conditioned Contrastive Learning for Multi-Granular Vision-Language Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 65
SafeRoPE: Risk-specific Head-wise Embedding Rotation for Safe Generation in Rectified Flow Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 66
FALCON: False-Negative Aware Learning of Contrastive Negatives in Vision-Language Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 67
Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 69
Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 70
EMO-R3: Reflective Reinforcement Learning for Emotional Reasoning in Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 71
EvoGraph-R1: Self-Evolving Multimodal Knowledge Hypergraphs for Agentic Retrieval
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 72
Cross-modal Identity Mapping: Minimizing Information Loss in Modality Conversion via Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 73
Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 74
Stabilizing Feature Geometry in Noisy Pretrained Models for Robust Downstream Tasks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 75
Black-Box Domain Adaptation for Object Detection with Retention-Driven Knowledge Compression
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 76
Decoupled and Reusable Adaptation for Efficient Cross-Modal Transfer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 77
Preference-Aligned LoRA Merging: Preserving Subspace Coverage and Addressing Directional Anisotropy
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 78
Curvature-Aware Zeroth-Order Optimization for Memory-Efficient Test-Time Adaptation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 79
Label-Free Cross-Task LoRA Merging with Null-Space Compression
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 80
Basis-Oriented Low-rank Transfer for Few-Shot and Test-Time Adaptation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 81
GeCo: Geometry-Consistent Regularization for Domain Generalized Semantic Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 82
Event-based Motion Deblurring with Unpaired Data
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 83
Stable Spike: Dual Consistency Optimization via Bitwise AND Operations for Spiking Neural Networks
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 84
Event-based Visual Deformation Measurement
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 85
Bidirectional Cross-Modal Prompting for Event-Frame Asymmetric Stereo
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 86
SpikeTrack: High-performance and Energy-efficient Event-Based Object Tracking with Spiking Neural Network
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 87
Event Structural Valley: A Unified Theoretical and Practical Framework for Event Camera Autofocus
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 89
Do You Have Freestyle? Expressive Humanoid Locomotion via Audio Control
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 90
CLaD: Planning with Grounded Foresight via Cross-Modal Latent Dynamics
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 91
InternData-A1: Pioneering High-Fidelity Synthetic Data for Pre-training Generalist Policy
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 92
DemoFunGrasp: Universal Dexterous Functional Grasping via Demonstration-Editing Reinforcement Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 93
GeniNav: Generative Model Driven Image-Goal Navigation via Imagination-Guided Consistency Flow Matching
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 94
Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 95
DRAMA: Next-Gen Dynamic Orchestration for Resilient Multi-Agent Ecosystems in Flux
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 96
Arcadia: Toward a Full-Lifecycle Framework for Embodied Lifelong Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 97
Wanderland: Geometrically Grounded Simulation for Open-World Embodied AI
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 98
ORV: 4D Occupancy-centric Robot Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 99
DextER: Language-driven Dexterous Grasp Generation with Embodied Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 100
Language-Free Generative Editing from One Visual Example
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 101
Omni IIE Bench: Benchmarking the Practical Capabilities of Image Editing Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 102
LuxRemix: Lighting Decomposition and Remixing for Indoor Scenes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 103
CompBench: Benchmarking Complex Instruction-guided Image Editing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 104
Garments2Look: A Multi-Reference Dataset for High-Fidelity Outfit-Level Virtual Try-On with Clothing and Accessories
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 105
Learning Personalized Photographic Style from Pairwise User Preferences
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 106
CogniEdit: Dense Gradient Flow Optimization for Fine-Grained Image Editing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 108
MOSAIC-GS: Monocular Scene Reconstruction via Advanced Initialization for Complex Dynamic Environments
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 109
REArtGS++: Generalizable Articulation Reconstruction with Temporal Geometry Constraint via Planar Gaussian Splatting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 110
Dynamic-eDiTor: Training-Free Text-Driven 4D Scene Editing with Multimodal Diffusion Transformer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 111
FaithFusion: Harmonizing Reconstruction and Generation via Pixel-wise Information Gain
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 112
IR-HGP: Physically-Aware Gaussian Inverse Rendering for High-Illumination Scenes via Generative Priors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 113
Seeing through boxes: Non-Line-of-Sight 3D Reconstruction from Radar Signals
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 114
Speeding Up the Learning of 3D Gaussians with Much Shorter Gaussian Lists
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 115
DynamicTree: Interactive Real Tree Animation via Sparse Voxel Spectrum
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 116
WildRayZer: Self-supervised Large View Synthesis in Dynamic Environments
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 117
DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 118
Retrieve-to-Restore: Efficient All-in-One Image Restoration with a Retrieval-Based Degradation Bank
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 119
MRI Contrast Enhancement Kinetics World Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 120
ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 121
Rethinking Knowledge Transfer in Image Quality Assessment: A Perceptual Preference Structure Alignment Perspective
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 122
ZeroIDIR: Zero-Reference Illumination Degradation Image Restoration with Perturbed Consistency Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 123
White-Balance First, Adjust Later: Cross-Camera Color Constancy via Vision-Language Evaluation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 124
Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 125
LF-BVN: Blind-View Network for Self-Supervised Light Field Denoising
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 128
Towards Generalized Representations for Low-Light Understanding: When Signal Constancy Meets Semantic Enrichment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 129
Synergistic Bleeding Region and Point Detection in Laparoscopic Surgical Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 130
MedCLIPSeg: Probabilistic Vision-Language Adaptation for Data-Efficient and Generalizable Medical Image Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 131
AD-GBC: Anisotropic Granular-Ball Skip-Connection Refiner for UNet-Based Medical Image Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 132
OSA: Echocardiography Video Segmentation via Orthogonalized State Update and Anatomical Prior-aware Feature Enhancement
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 133
VesMamba: 3D Pulmonary Vessel Segmentation from CT images via Mamba with Structural Perception and Scale-aware Filtering
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 134
SemiGDA: Generative Dual-distribution Alignment for Semi-Supervised Medical Image Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 135
Diffusion-Based Native Adversarial Synthesis for Enhanced Medical Segmentation Generalization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 136
CG-Reasoner: Centroid-Guided Positional Reasoning Segmentation for Medical Imaging with a Robust Visual-Text Consistency Metric
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 137
Instruction-Guided Lesion Segmentation for Chest X-rays with Automatically Generated Large-Scale Dataset
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 138
Towards Highly Transferable Vision-Language Attack via Semantic-Augmented Dynamic Contrastive Interaction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 139
Towards Human-Imperceptible Backdoor Attacks on Text-to-Image Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 140
TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 141
DualMirage: Hunting Stealthy Multimodal LLM Agents via CAPTCHAs with Contour and Adversarial Illusions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 142
Models as Lego Builders: Assembling Malice from Benign Blocks via Semantic Blueprints
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 143
Source Models Leak What They Shouldn’t: Unlearning Zero-Shot Transfer in Domain Adaptation Through Adversarial Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 144
A Unified Perspective on Adversarial Membership Manipulation in Vision Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 145
Shedding Light on VLN Robustness: A Black-box Framework for Indoor Lighting-based Adversarial Attack
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 146
OddGridBench: Exposing the Lack of Fine-Grained Visual Discrepancy Sensitivity in Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 147
Beyond What's Shared: Recovering Lost Unique Information from Intermediate Layers to Boost Multimodal Geo-Foundation Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 148
WikiCLIP: An Efficient Contrastive Baseline for Open-domain Visual Entity Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 149
CLCR: Cross-Level Semantic Collaborative Representation for Multimodal Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 150
Learning Anchor in Dual Orthogonal Space for Fast Multi-view Clustering
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 153
FAVE: A Structured Benchmark for Fine-Grained Audio-Visual Temporal Evaluation in Multimodal LLMs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 154
Omni2Sound: Towards Unified Video-Text-to-Audio Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 155
EmoThinker: Advancing Visual-Acoustic Emotion Analysis via Structural Token Selection and Chain-of-Thought Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 156
Enhancing Descriptive Captions with Visual Attributes for Multimodal Perception
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 158
Vision-Speech Models: Teaching Speech Models to Converse about Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 159
EMMA: Extracting Multiple physical parameters from Multimodal Data
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 160
MMGait: Towards Multi-Modal Gait Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 161
OSMO: Open-vocabulary Self-eMOtion Tracking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 162
MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 164
Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 165
Active Perceptual Inference: A Corticothalamic-Inspired Dynamic Nested Recurrent Network for Multimodal Sentiment Analysis with Incomplete Data
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 166
Scalable Trajectory Generation for Whole-Body Mobile Manipulation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 167
Breaking the 3D Dataset Bottleneck: Fast Scalable Generation of Aligned 3D Assets from Scratch for Category 6D Pose Estimation and Robotic Grasping
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 168
Real-Time Multimodal Fingertip Contact Detection via Depth and Motion Fusion for Vision-Based Human–Computer Interaction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 169
Glove2Hand: Synthesizing Natural Hand-Object Interaction from Multi-Modal Sensing Gloves
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 170
UniDex: A Robot Foundation Suite for Universal Dexterous Hand Control from Egocentric Human Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 171
ConsID-Gen: View-Consistent and Identity-Preserving Image-to-Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 172
DiverseGRPO: Mitigating Mode Collapse in Image Generation via Diversity-Aware GRPO
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 173
VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 174
Video Generation with Stable Transparency via Shiftable RGB-A Distribution Learner
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 175
MOFA-VTON: More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 176
Scaling Multi-Identity Consistency for Image Customization via Multi-to-Multi Matching Paradigm
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 177
NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 178
Functional Mean Flow in Hilbert Space
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 179
Benchmarking Single-Factor Physical Video-to-Audio Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 180
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 181
Refaçade: Editing Object with Given Reference Texture
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 182
Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 183
Not All Birds Look The Same: Identity-Preserving Generation For Birds
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 184
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 185
EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 186
Clothe and Pose
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 187
FlowPortal: Residual-Corrected Flow for Training-Free Video Relighting and Background Replacement
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 188
The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 189
Rethinking UMM Visual Generation: Masked Modeling for Efficient Image-Only Pre-training
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 191
Bidirectional Normalizing Flow: From Data to Noise and Back
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 192
ShotDirector: Directorially Controllable Multi-Shot Video Generation with Cinematographic Transitions
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 193
Are Image-to-Video Models Good Zero-Shot Image Editors?
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 194
FastLightGen: Fast and Light Video Generation with Fewer Steps and Parameters
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 195
Unified Latent Space for Understanding and Generation via Semantic Auto-encoder
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 196
AHS: Adaptive Head Synthesis via Synthetic Data Augmentations
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 197
CASR: A Robust Cyclic Framework for Arbitrary Large-Scale Super-Resolution with Distribution Alignment and Self-Similarity Awareness
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 198
Thermal Diffusion Matters: Infrared Spatial-Temporal Video Super-Resolution through Heat Conduction Priors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 199
TextOVSR: Text-Guided Real-World Opera Video Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 200
VoDaSuRe: A Large-Scale Dataset Revealing Domain Shift in Volumetric Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 201
GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 202
Adaptive Anisotropic Gaussian Splatting for Multi-contrast MRI Arbitrary-Scale Super-Resolution with Anatomy Guidance
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 203
SignPR: A Progressive Vector-Quantized Diffusion Framework for Sign Language Production
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 204
LLaMo: Scaling Pretrained Language Models for Unified Motion Understanding and Generation with Continuous Autoregressive Tokens
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 205
FlashCap: Millisecond-Accurate Human Motion Capture via Flashing LEDs and Event-Based Vision
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 206
Geometric Neural Distance Fields for Learning Human Motion Priors
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 207
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 208
Decoupled Generative Modeling for Human-Object Interaction Synthesis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 209
LiveGesture: Streamable Co-Speech Gesture Generation Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 210
HandX: Scaling Bimanual Motion and Interaction Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 211
MaskAdapt: Learning Flexible Motion Adaptation via Mask-Invariant Prior for Physics-Based Characters
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 212
FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 213
ProjFlow: Projection Sampling with Flow Matching for Zero-Shot Exact Spatial Motion Control
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 214
Correspondence-Attention Alignment for Multi-View Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 215
GenErase: Generalizable and Semantically-Aware Concept Erasure in Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 216
MatMart: Material Reconstruction of 3D Objects via Diffusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 217
Region-Adaptive Sampling for Diffusion Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 218
Diffusion Guided Chain-of-Vision for Large Autoregressive Vision Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 219
Guiding Diffusion-based Reconstruction with Contrastive Signals for Balanced Visual Representation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 220
ConceptPrism: Concept Disentanglement in Personalized Diffusion Models via Residual Token Optimization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 221
Heterogeneous Decentralized Diffusion Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 222
Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 223
GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 224
ENC-Bench: A Benchmark for Evaluating Multimodal Large Language Models in Electronic Navigational Chart Understanding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 226
RealBirdID: Benchmarking Bird Species Identification in the Era of MLLMs
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 228
PP-OCRv5: A Specialized 5M-Parameter Model Rivaling Billion-Parameter Vision-Language Models on OCR Tasks
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 229
World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 230
Gastric-X: A Multimodal Multi-Phase Benchmark Dataset for Advancing Vision-Language Models in Gastric Cancer Analysis
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 231
HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 232
HandVQA: Diagnosing and Improving Fine-Grained Spatial Reasoning about Hands in Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 233
Probing and Bridging Geometry–Interaction Cues for Affordance Reasoning in Vision Foundation Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 234
ARC Is a Vision Problem!
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 235
Geoint-R1: Formalizing Multimodal Geometric Reasoning with Dynamic Auxiliary Constructions
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 236
S^2-MLLM: Boosting Spatial Reasoning Capability of MLLMs for 3D Visual Grounding with Structural Guidance
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 237
Learning Multi-View Spatial Reasoning from Cross-View Relations
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 238
Exploring Spatial Intelligence from a Generative Perspective
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 239
Physical Object Understanding with a Physically Controllable World Model
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 240
QueryMe: Query-Driven Open-Vocabulary 3D Object Affordances Grounding from Multimodal Evidence
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 241
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 242
EG-3DVG: Expression and Geometry Aware Grounding Decoder for 3D Visual Grounding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 243
AffordMatcher: Affordance Learning in 3D Scenes from Visual Signifiers
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 244
SpatiaLQA: A Benchmark for Evaluating Spatial Logical Reasoning in Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 245
Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 246
Intra-class Distribution-guided Generative Hashing with Neighbor Refinement for Cross-modal Retrieval
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 247
Language-driven Fine-grained Retrieval
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 248
MRD: Multi-resolution Retrieval-Detection Fusion for High-Resolution Image Understanding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 249
RetFormer: Multimodal Retrieval for Enhancing Image Recognition
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 250
DREAM: Document Recognition with Explicit Adaptive Memory
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 251
RMIR: A Benchmark Dataset for Reasoning-Intensive Multimodal Image Retrieval
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 252
POGA: Paraphrased and Oppositional Graph Alignment for Fine-Grained Cross-Modal Retrieval
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 253
Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 254
TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 255
RiskProp: Collision-Anchored Self-Supervised Risk Propagation For Early Accident Anticipation
[Slides]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 256
MotionEnhancer: Leveraging Video Diffusion for Motion-Enhanced Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 257
MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 258
Asynchronous Temporal Modeling with Two-Agent Framework for Streaming Dense Video Captioning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 259
TRCoRSurg: Temporal-Relational Co-Reasoning for Surgical Video Triplet Recognition
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 260
OASIS: On-Demand Hierarchical Event Memory for Streaming Video Reasoning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 261
One-Shot Flow, Any-Time Frame: A Bidirectional Warping Framework for Event-Based Video Frame Interpolation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 262
TF-CADE: Foreground-Concentrated Text-Video Alignment for Zero-Shot Temporal Action Detection
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 263
PRISM: Prototype-based Reasoning with Inter-modal Semantic Mining for Interpretable Image Recognition
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 264
Concept Regions Matter: Benchmarking CLIP with a New Cluster-Importance Approach
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 265
PhaseWin Search Framework Enable Efficient Object-Level Interpretation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 266
Beyond Top Activations: Efficient and Reliable Crowdsourced Evaluation of Automated Interpretability
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 267
From Weights to Concepts: Data-Free Interpretability of CLIP via Singular Vector Decomposition
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 268
Hierarchical Concept Embedding & Pursuit for Interpretable Image Classification
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 269
Interpretable and Steerable Concept Bottleneck Sparse Autoencoders
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 270
C-LaV: Conditional Latent Velocity Field Denoising for Weather-Robust LiDAR Place Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 271
Towards Foundation Models for 3D Scene Understanding: Instance-Aware Self-Supervised Learning for Point Clouds
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 272
Generalized-CVO: Fast and Correspondence-Free Local Point Cloud Registration with Second Order Riemannian Optimization
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 273
LiDeRe: A Lightweight Readout for Fast and Data-Efficient Dense Prediction
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 275
CoLC: Communication-Efficient Collaborative Perception with LiDAR Completion
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 276
Spectral-Geometric Neural Fields for Pose-Free LiDAR View Synthesis
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 277
C-GenReg: Training-Free 3D Point Cloud Registration by Multi-View-Consistent Geometry-to-Image Generation with Probabilistic Modalities Fusion
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 278
PatchAlign3D: Local Feature Alignment for Dense 3D Shape Understanding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 279
FoV-Net: Rotation-Invariant CAD B-rep Learning via Field-of-View Ray Casting
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 280
Neural Distribution Prior for LiDAR Out-of-Distribution Detection
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 281
DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 282
Concept-Aware Batch Sampling Improves Language-Image Pretraining
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 283
HiFICL: High-Fidelity In-Context Learning for Multimodal Tasks
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 284
InstAP: Instance-Aware Vision-Language Pre-Train for Spatial-Temporal Understanding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 285
Vocabulary Scaling Law: Tuning Open-vocabulary Predictors for Their Openness
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 286
Render-to-Adapt: Unsupervised Personal Adaptation for Gaze Estimation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 287
ViTPrompt: Training-Free Prompt Refinement with Visual Tokens for Open-Vocabulary Detection
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 288
Cluster-Aware Neural Collapse Prompt Tuning for Long-Tailed Generalization of Vision-Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 289
LLMind: Bio-inspired Training-free Adaptive Visual Representations for Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 290
Dynamic Logits Adjustment and Exploration for Test-Time Adaptation in Vision Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 291
CAPT: Confusion-Aware Prompt Tuning for Reducing Vision-Language Misalignment
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 292
GenMatter: Perceiving Physical Objects with Generative Matter Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 293
Bidirectional Query-Driven Generation of Parametric CAD Sketch
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 294
The Missing GAP: From Solving Square Jigsaw Puzzles to Handling Real World Archaeological Fragments
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 295
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 296
OmniDocLayout: Towards Diverse Document Layout Generation via Coarse-to-Fine LLM Learning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 297
Yo'City: Personalized and Boundless 3D Realistic City Scene Generation via Self-Critic Expansion
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 298
Repurposing 3D Generative Model for Autoregressive Layout Generation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 299
CAD-Refiner: A Unified Framework for CAD Generation and Iterative Editing
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 300
A Debiased Reconstruction-based Framework for Training-Free Detection of AI-Generated Images
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 301
Global Information Thresholding for Sufficient and Necessary Circuits
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 302
PrivateEyes: Gaze-Preserving Anonymization for Data Sharing
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 303
From Measurement to Mitigation: Quantifying and Reducing Identity Leakage in Image Representation Encoders with Linear Subspace Removal
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 304
Bias In, Bias Out? Finding Unbiased Subnetworks in Vanilla Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 305
pH-Strips for Selective Forgetting: A Blunt but Fast Diagnostic Baseline for Machine Unlearning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 306
Decoupling Defense Strategies for Robust Image Watermarking
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 307
Unsafe2Safe: Controllable Image Anonymization for Downstream Utility
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 310
DP-FedAdamW: An Efficient Optimizer for Differentially Private Federated Large Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 311
Submodel Extraction for Efficient and Personalized Federated Learning via Optimal Transport
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 313
FedDAP: Domain-Aware Prototype Learning for Federated Learning under Domain Shift
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 314
FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 315
VIRST: Video-Instructed Reasoning Assistant for SpatioTemporal Segmentation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 317
Stay in your Lane: Role Specific Queries with Overlap Suppression Loss for Dense Video Captioning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 318
T2SGrid: Temporal-to-Spatial Gridification for Video Temporal Grounding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 319
HanDyVQA: A Video QA Benchmark for Fine-Grained Hand-Object Interaction Dynamics
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 320
SAIL: Similarity-Aware Guidance and Inter-Caption Augmentation-based Learning for Weakly-Supervised Dense Video Captioning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 321
Token Warping Helps MLLMs Look from Nearby Viewpoints
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 322
Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 323
Fine-Grained Post-Training Quantization for Large Vision Language Models with Quantization-Aware Integrated Gradients
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 324
Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 325
IF-Prune: Information-Flow Guided Token Pruning for Efficient Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 326
EvoComp: Learning Visual Token Compression for Multimodal Large Language Models via Semantic-Guided Evolutionary Labeling
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 327
DocPrune: Efficient Document Question Answering via Background, Question, and Comprehension-aware Token Pruning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 328
QuietPrune: Query-Guided Early Token Pruning for Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 329
The Devil Is in Gradient Entanglement: Energy-Aware Gradient Coordinator for Robust Generalized Category Discovery
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 330
LLM-Guided Probabilistic Fusion for Label-Efficient Document Layout Analysis
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 331
Coordinate Denoising for Non‑Equilibrium Molecular Representation Learning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 332
Plug-and-Play Incomplete Multi-View Clustering via Janus-Faced Affinity Learning with Topology Harmonization
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 333
Meta-Learning In-Context Enables Training-Free Cross Subject Brain Decoding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 334
Measure The Feature Universe: Topology-based Pseudo Labeling and Gravity Consistency for Source-Free Domain Adaptation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 335
Conditional Factuality Controlled LLMs with Generalization Certificates via Conformal Sampling
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 336
Harnessing the Power of Foundation Models for Accurate Material Classification
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 337
Content-Aware Frequency Encoding for Implicit Neural Representations with Fourier-Chebyshev Features
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 338
ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 339
TeFlow: Enabling Multi-frame Supervision for Self-Supervised Feed-forward Scene Flow Estimation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 340
Think Before You Drive: World Model-Inspired Multimodal Grounding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 342
DrivePTS: A Progressive Learning Framework with Textual and Structural Enhancement for Driving Scene Generation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 343
WOD-E2E: Waymo Open Dataset for End-to-End Driving in Challenging Long-tail Scenarios
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 344
GuideFlow: Constraint-Guided Flow Matching for Planning in End-to-End Autonomous Driving
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 345
ResAD: Normalized Residual Trajectory Modeling for End-to-End Autonomous Driving
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 346
KnowVal: A Knowledge-Augmented and Value-Guided Autonomous Driving System
[Slides]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 347
FoSS: Modeling Long-Range Dependencies and Multimodal Uncertainty in Trajectory Prediction via Fourier–State Space Integration
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 348
NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 349
Visual Prototype Conditioned Focal Region Generation for UAV-Based Object Detection
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 350
Consistent Instance Field for Dynamic Scene Understanding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 351
CLP: A Real-World Dataset of Contaminated Lens Protectors for Robust Semantic Segmentation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 353
Heuristic Self-Paced Learning for Domain Adaptive Semantic Segmentation under Adverse Conditions
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 354
SAM2Text: Towards Prompt-Free and Multi-Resolution Video Scene Text Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 355
Reinforcing Video Reasoning Segmentation to Think Before It Segments
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 356
VideoMaMa: Mask-Guided Video Matting via Generative Prior
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 357
Quantized Residuals to Continuous Prompts for Few-Shot Class Incremental Learning in Vision-Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 358
The Golden Subspace: Where Efficiency Meets Generalization in Continual Test-Time Adaptation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 359
SAIDO: Generalizable Detection of AI-Generated Images via Scene-Aware and Importance-Guided Dynamic Optimization in Continual Learning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 360
Is Parameter Isolation Better for Prompt-Based Continual Learning?
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 361
Octopus: History-Free Gradient Orthogonalization for Continual Learning in Multimodal Large Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 362
Affordance-First Decomposition for Continual Learning in Video–Language Understanding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 363
Quantum-Gated Task-interaction Knowledge Distillation for Pre-trained Model-based Class-Incremental Learning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 364
Elastic Weight Consolidation Done Right for Continual Learning
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 365
On Token's Dilemma: Dynamic MoE with Drift-Aware Token Assignment for Continual Learning of Large Vision Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 366
Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 367
Talking Together: Synthesizing Co-Located 3D Conversations from Audio
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 368
InfinityHuman: Towards Long-Term Audio-Driven Human Animation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 369
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 370
AudioAvatar: Personalized Audio-driven Whole-body Talking Avatars
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 371
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 372
Counterfactual VLA: Self-Reflective Vision-Language-Action Model with Adaptive Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 373
SGDrive: Scene-to-Goal Hierarchical World Cognition for Autonomous Driving
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 374
CapNav: Benchmarking Vision Language Models on Capability-conditioned Indoor Navigation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 375
AutoTraces: Autoregressive Trajectory Forecasting via Multimodal Large Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 376
AwareVLN: Reasoning with Self-awareness for Vision-Language Navigation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 377
Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 378
Tavatar: Topology-Aware Gaussian Attribute Derivation for Animatable Human Avatars
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 379
PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 380
PhysHead: Simulation-Ready Gaussian Head Avatars
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 381
ReWeaver: Towards Simulation-Ready and Topology-Accurate Garment Reconstruction
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 382
FHAvatar: Fast and High-Fidelity Reconstruction of Face-and-Hair Composable 3D Head Avatar from Few Casual Captures
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 383
Feed-Forward One-Shot Animatable Textured Mesh Avatar Reconstruction
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 384
Reallocating Attention Across Layers to Reduce Multimodal Hallucination
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 385
VES-RFT: Rewarding Visual Evidence Sensitivity to Mitigate Hallucinations in Large Vision–Language Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 386
Fighting Hallucinations with Counterfactuals: Diffusion-Guided Perturbations for LVLM Hallucination Suppression
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 387
Unstitching the Chimera: Frame-Level Risk and Train-Free Mitigation for Video Hallucination
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 389
Breaking the Illusion: When Positive Meets Negative in Multimodal Decoding
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 390
FlexTraj: Image-to-Video Generation with Flexible Point Trajectory Control
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 391
Diff4Splat: Repurposing Video Diffusion Models for Dynamic Scene Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 392
Spatia: Video Generation with Updatable Spatial Memory
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 393
Geometry-as-context: Modulating Explicit 3D in Scene-consistent Video Generation to Geometry Context
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 394
EgoControl: Controllable Egocentric Video Generation via 3D Full-Body Poses
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 395
CustomTex: High-fidelity Indoor Scene Texturing via Multi-Reference Customization
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 396
FoleyDesigner: Immersive Stereo Foley Generation with Precise Spatio-Temporal Alignment for Film Clips
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 397
Physical Simulator In-the-Loop Video Generation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 398
Refracting Reality: Generating Images with Realistic Transparent Objects
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 399
Generating Humanless Environment Walkthroughs from Egocentric Walking Tour Videos
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 400
EgoFlow: Gradient-Guided Flow Matching for Egocentric 6DoF Object Motion Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 401
Spatial-Frequency Collaborative Learning for Occluded Visible-Infrared Person Re-Identification
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 402
Mind the Gap: Transferring Labels to Align Object Detection Datasets
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 403
SSM-Aware Token-Efficient VMamba via Adaptive Patch Pruning and Merging for Person Re-Identification
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 404
Tri-Modal Fusion Transformers for UAV-based Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 405
View-Aware Semantic Alignment for Aerial-Ground Person Re-Identification
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 406
RHCNet: Residual-Guided Hierarchical Calibration Network for Robust Underwater Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 407
X-AVDT: Audio-Visual Cross-Attention for Robust Deepfake Detection
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 408
Beyond Duality: A Hybrid Framework of Leveraging Shared and Private Features for RGB-Event Object Detection
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 409
FVBench: Benchmarking Deepfake Video Detection Capability of Large Multimodal Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 410
AKCMamba-YOLO: Selective State Space Models For Real-Time Object Detection
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 411
When AVSR Meets Video Conferencing: Dataset, Degradation, and the Hidden Mechanism Behind Performance Collapse
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 412
Your One-Stop Solution for AI-Generated Video Detection
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 413
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 414
Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 415
HumanVBench: Probing Human-Centric Video Understanding in MLLMs with Automatically Synthesized Benchmarks
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 416
HERBench: A Benchmark for Multi-Evidence Integration in Video Question Answering
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 417
Seeing the Scene Matters: Revealing Forgetting in Video Understanding Models with a Scene-Aware Long-Video Benchmark
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 418
Thinking with Frames: Generative Video Distortion Evaluation via Frame Reward Model
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 419
MovieRecapsQA: A Multimodal Open-Ended Video Question-Answering Benchmark
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 420
Training-free, Perceptually Consistent Low-Resolution Previews with High-Resolution Image for Efficient Workflows of Diffusion Models
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 421
One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 422
Reflection Separation from a Single Image via Joint Latent Diffusion
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 423
MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 424
DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 426
VMonarch: Efficient Video Diffusion Transformers with Structured Attention
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 427
DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 428
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 429
Transition Matching Distillation for Fast Video Generation
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 430
Diffusion-Based Makeup Transfer with Facial Region-Aware Makeup Features
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 431
UniPR: Unified Object-level Real-to-Sim Perception and Reconstruction from a Single Stereo Pair
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 432
Query2Uncertainty: Robust Uncertainty Quantification and Calibration for 3D Object Detection under Distribution Shift
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 433
DICArt: Advancing Category-level Articulated Object Pose Estimation in Discrete State-Spaces
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 434
PoseGaussian: 6D Pose Estimation for Unseen Objects via Sparse-View Object-Level 3D Gaussian Splatting
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 435
VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 436
MonoSAOD: Monocular 3D Object Detection with Sparsely Annotated Label
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 437
V2U4Real: A Real-world Large-scale Dataset for Vehicle-to-UAV Cooperative Perception
[Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 438
SketchVL: Policy Optimization via Fine-Grained Credit Assignment for Chart Understanding and More
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 439
A Causal Marriage between VLM and IRM from Understanding to Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 440
Why Does RL Generalize Better Than SFT? A Data-Centric Perspective on VLM Post-Training
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 441
SoC: Semantic Orthogonal Calibration for Test-Time Prompt Tuning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 442
Learning to Select Visual Tools from Experience
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 443
Agile Deliberation: Concept Deliberation for Subjective Visual Classification
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 444
Tea-Adapter: Teacher Adapter for Efficient Conditional Generation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 445
From Failure to Feedback: Group Revision Unlocks Hard Cases in Object-Level Grounding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 446
Perception Characteristics Distance: Measuring Stability and Robustness of Perception System in Dynamic Conditions under a Certain Decision Rule
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 447
FinPercep-RM: A Fine-grained Reward Model and Co-evolutionary Curriculum for RL-based Real-world Super-Resolution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 448
Twin-T & TwintVQA: A Reliable Structure–Detail Separating VLM and a Comprehensive Benchmark for Chart and Table Tasks
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 449
SDGS: Spatial Difference Guided Gaussian Splatting for Simultaneous Localization and 3D Reconstruction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 450
RT-Splatting: Joint Reflection-Transmission Modeling with Gaussian Splatting
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 451
Pose-Free Omnidirectional Gaussian Splatting for 360-Degree Videos with Consistent Depth Priors
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 452
Distilling Unsigned Distance Function for Surface Reconstruction from 3D Gaussian Splatting
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 453
Exact-GS: Mathematically Rigorous and Accurate 3D Gaussian Splatting for 3D X-ray Reconstruction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 454
DualSplat: Robust 3D Gaussian Splatting via Pseudo-Mask Bootstrapping from Reconstruction Failures
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 455
E2EGS: Event-to-Edge Gaussian Splatting for Pose-Free 3D Reconstruction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 456
Neural Gabor Splatting: Enhanced Gaussian Splatting with Neural Gabor for High-frequency Surface Reconstruction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 457
DirectFisheye-GS: Enabling Native Fisheye Input in Gaussian Splatting with Cross-View Joint Optimization
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 458
VAD-GS: Visibility-Aware Densification for 3D Gaussian Splatting in Dynamic Urban Scenes
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 459
GauMVC: Generative Decoupled Gaussian Representation for Human-centric Multi-view Video Compression
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 460
A Geometric Algebra-Informed 3DGS Framework for Wireless Channel Prediction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 461
RaGS: Unleashing 3D Gaussian Splatting from 4D Radar and Monocular Cue for 3D Object Detection
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 462
Cross-Instance Gaussian Splatting Registration via Geometry-Aware Feature-Guided Alignment
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 463
ActivePolicy: Active Gaussian Reconstruction and Optimization Strategy Based on Global-Local Information Gain
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 464
Uncertainty-driven 3D Gaussian Splatting Active Mapping via Anisotropic Visibility Field
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 465
SV-GS: Sparse View 4D Reconstruction with Skeleton-Driven Gaussian Splatting
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 466
NimbusGS: Unified 3D Scene Reconstruction under Hybrid Weather
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 467
SparseSplat: Towards Applicable Feed-Forward 3D Gaussian Splatting with Pixel-Unaligned Prediction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 468
REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 469
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 470
Unlocking Token Rewards via Training-Free Reward Attribution
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 472
When to Think and When to Look: Uncertainty-Guided Lookback
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 473
StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 474
Understanding Counting Mechanisms in Large Language and Vision-Language Models
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 475
CLiViS: Unleashing Cognitive Map through Linguistic-Visual Synergy for Embodied Visual Reasoning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 476
Proof-of-Perception: Certified Tool-Using Multimodal Reasoning with Compositional Conformal Guarantees
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 479
Hugging Visual Prompt and Segmentation Tokens: Consistency Learning for Fine-Grained Visual Understanding in MLLMs
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 480
VisionLeaf: Entropy-Guided Leaf-First Reasoning for Efficient and Accurate Think-with-Image
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 481
GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 482
Beyond Depth: Evaluating the Width-centric Reasoning Capability of MLLMs
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 483
GenSplat: Bridging the Generalization Gap in 3DGS Language Comprehension
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 484
CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 485
LoPrune: Efficient Data Pruning for LoRA-Based Fine-Tuning of Vision Transformer
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 488
RADAR: VQ-VAE Decoder of VAR is a Good Student for Restoring Against Degradation by Acceleration
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 489
Beyond Single Solution: Multi-Hypothesis Deep Unfolding Network for Image Compressive Sensing
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 490
FlashDecoder: Real-Time Latent-to-Pixel Streaming Decoder with Transformers
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 491
MambaSIC: Mamba-based Stereo Image Compression with Bi-directional Multi-reference Entropy Model
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 492
Neural Dynamic GI: Random-Access Neural Compression for Temporal Lightmaps in Dynamic Lighting Environments
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 493
Discovering Adaptive Task Dependencies for Efficient Multi-Task Representation Compression
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 494
OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 495
Perceptual Neural Video Compression with Color Separation and Rank Chain
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 496
Beyond Matching to Tiles: Bridging Unaligned Aerial and Satellite Views for Vision-Only UAV Navigation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 498
PiLoT: Neural Pixel-to-3D Registration for UAV-based Ego and Target Geo-localization
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 499
PAUL: Uncertainty-Guided Partition and Augmentation for Robust Cross-View Geo-Localization under Noisy Correspondence
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 500
UniGeoRS: A Unified Benchmark for Tri-view Geo-Localization
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 502
Watch and Learn: Learning to Use Computers from Online Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 503
OneThinker: All-in-one Reasoning Model for Image and Video
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 504
Incentivizing Versatile Video Reasoning in MLLMs via Data-Efficient Reinforcement Learning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 505
Act2See: Emergent Active Visual Perception for Video Reasoning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 506
VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 507
ViLoMem: Agentic Learner with Grow-and-Refine Multimodal Semantic Memory
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 508
ReMoT: Reinforcement Learning with Motion Contrast Triplets
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 509
Incentivizing Generative Zero-Shot Learning via Outcome-Reward Reinforcement Learning with Visual Cues
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 510
Semantic-Guided Global-Local Collaborative Prompt Learning for Few-Shot Class Incremental Learning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 511
Beyond Heuristic Prompting: A Concept-Guided Bayesian Framework for Zero-Shot Image Recognition
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 512
One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 513
Data-Centric Meta-Learning for Robust Few-Shot Generalization
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 514
Bridging the Modality Gap in Compositional Zero-Shot Learning via Sparse Alignment and Unimodal Memory Bank
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 515
LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 517
Uncertainty-Aware Knowledge Distillation for Multimodal Large Language Models
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 518
Beyond Soft Label: Dataset Distillation via Orthogonal Gradient Matching
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 519
BHCast: Unlocking Black Hole Plasma Dynamics from a Single Blurry Image with Long-Term Forecasting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 520
RawMetaDiff: Unlocking Extreme Darkness from Dual-Exposure RAW with Meta-Guided Diffusion
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 521
Prospective Dynamic 3D MRI Reconstruction via Latent-Space Motion Tracking from Single Measurement
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 522
Lens Component Deletion based on Differentiable Ray Tracing
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 523
X-band Radar Non-Line-of-Sight Imaging
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 524
3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 525
UAVLight: A Benchmark for Illumination-Robust 3D Reconstruction in Unmanned Aerial Vehicle (UAV) Scenes
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 526
Polarization State Tracing for Reflection Removal and Color-Consistent Reconstruction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 527
GFRRN: Explore the Gaps in Single Image Reflection Removal
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 528
Efficient All-Pairs Correlation Volume Sampling for Optical Flow Estimation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 529
Cross-Slice Knowledge Transfer via Masked Multi-Modal Heterogeneous Graph Contrastive Learning for Spatial Gene Expression Inference
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 530
Adapting a Pre-trained Single-Cell Foundation Model to Spatial Gene Expression Generation from Histology Images
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 532
SO(3)-Equivariant ViT-Adapter for Data-Efficient Zero-Shot Sim-to-Real Indoor Panoramic Depth Estimation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 533
Sparsity-Aware Voxel Attention and Foreground Modulation for 3D Semantic Scene Completion
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 534
XPaintNet: An eXtreme Lightweight Framework for Stereoscopic Conversion without Inpainting Network
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 535
MD2E: Modeling Depth-to-Edge Cues for Monocular Metric Depth Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 536
LiteSense: Lifting Lightweight ToF with RGB for High-Resolution Metric Depth Estimation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 537
3D-Aware Multi-Task Learning with Cross-View Correlations for Dense Scene Understanding
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 538
The Midas Touch for Metric Depth
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 539
Lifting Unlabeled Internet-level Data for 3D Scene Understanding
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 540
ObjectMorpher: 3D-Aware Image Editing via Deformable 3DGS
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 542
MeshFlow: Efficient Artistic Mesh Generation via MeshVAE and Flow-based Diffusion Transformer
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 543
WonderZoom: Multi-Scale 3D World Generation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 544
SceneTok: A Compressed, Diffusable Token Space for 3D Scenes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 545
PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 546
Extend3D: Town-Scale 3D Generation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 547
Pano3DComposer: Feed-Forward Compositional 3D Scene Generation from Single Panoramic Image
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 548
MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 549
CaliTex: Geometry-Calibrated Attention for View-Coherent 3D Texture Generation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 550
CraftMesh: High-Fidelity Generative Mesh Manipulation via Poisson Seamless Fusion
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 551
LoG3D: Ultra-High-Resolution 3D Shape Modeling via Local-to-Global Partitioning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 552
MaskFocus: Focusing Policy Optimization on Critical Steps for Masked Image Generation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 553
Efficient Training for Human Video Generation with Entropy-Guided Prioritized Progressive Learning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 554
PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 555
GRPO-Guard: Mitigating Implicit Over-Optimization in Flow Matching via Regulated Clipping
[
Slides]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 556
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 557
Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 558
VISTA: A Test-Time Self-Improving Video Generation Agent
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 559
Neighbor GRPO: Contrastive ODE Policy Optimization Aligns Flow Models
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 560
SMV-EAR: Bring Spatiotemporal Multi-View Representation Learning into Efficient Event-Based Action Recognition
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 561
Hierarchical Action Learning for Weakly-Supervised Action Segmentation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 562
Gamba: Mamba-based Graph Convolutional Network with Dynamic Graph Topology Learning for Action Recognition
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 563
Beyond Binary Contrast: Modeling Continuous Skeleton Action Spaces with Transitional Anchors
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 564
PRISM: Learning a Shared Primitive Space for Transferable Skeleton Action Representation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 565
TWEO: Transformers Without Extreme Outliers Enables FP8 Training And Quantization For Dummies
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 566
Unified Spherical Frontend: Learning Rotation-Equivariant Representations of Spherical Images from Any Camera
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 569
Towards Knowledge-augmented Bayesian Deep Learning For Computer Vision
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 570
NESTOR: A Nested MOE-based Neural Operator for Large-Scale PDE Pre-Training
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 571
Evidential Transformation Network: Turning Pretrained Models into Evidential Models for Post-hoc Uncertainty Estimation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 572
Beyond Euclidean Gossip: KL-Barycentric Consensus on Heterogeneous and Imbalanced Images
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 574
Batch Loss Score for Dynamic Data Pruning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 575
Teacher-Guided Routing for Sparse Vision Mixture-of-Experts
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 576
WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 577
MangoBench: A Benchmark for Multi-Agent Goal-Conditioned Offline Reinforcement Learning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 578
iSHIFT: Lightweight Slow-Fast GUI Agent with Adaptive Perception
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 579
MMBench-GUI: A Unified Hierarchical Evaluation Framework for Multi-Platform GUI Agents
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 580
Boosting Vision-Language Models Towards Cross-Domain Incremental Object Detection
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 581
UniSpector: Towards Universal Open-set Defect Recognition via Spectral-Contrastive Visual Prompting
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 582
Unlearning without Forgetting: Securely Removing Targeted Concepts from Large-Scale Vision-Language Open-Vocabulary Detectors
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 583
UNI-OOD: Unified Object- and Image-level Out-of-Distribution Detection via Cross-Context Attentive Vision-Language Modeling
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 584
S2C2Seg: Semantic-Spatial Consistency and Category Optimization for Open-Vocabulary Segmentation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 585
NoOVD: Novel Category Discovery and Embedding for Open-Vocabulary Object Detection
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 586
The Missing Point in Vision Transformers for Universal Image Segmentation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 587
PromptMoE: A Segmentation Refinement Framework Leveraging Mixture of Experts for Improved Prompting
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 588
The Power of Prior: Training-Free Open-Vocabulary Semantic Segmentation with LLaVA
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 589
Beyond Text: Visual Description Assembly by Probabilistic Model for CLIP-based Weakly Supervised Semantic Segmentation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 590
High-Precision Dichotomous Image Segmentation via Depth Integrity-Prior and Fine-Grained Patch Strategy
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 591
GeoSAM2: Unleashing the Power of SAM2 for 3D Part Segmentation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 592
Material Magic Wand: Material-Aware Grouping of 3D Parts in Untextured Meshes
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 593
Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 594
Unlocking 3D Affordance Segmentation with 2D Semantic Knowledge
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 595
HySeg: Learning Generative Priors for Structure-Aware Remote Sensing Segmentation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 597
MMVIP: A Visible-infrared Paired Dataset for Multi-weather Marine Vision
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 598
Beyond Tie Points: Satellite Image Block Adjustment based on Dense Feature Consistency
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 599
Spectrally Distilled Representations Aligned with Instruction-Augmented LLMs for Satellite Imagery
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 600
Global Underwater Geolocation from Time-Lapse Polarization Imagery
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 601
Olbedo: An Albedo and Shading Aerial Dataset for Large-Scale Outdoor Environments
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 602
PRUE: A Practical Recipe for Field Boundary Segmentation at Scale
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 603
SARMAE: Masked Autoencoder for SAR Representation Learning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 604
LNEM: Lunar Neural Elevation Model
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 605
A Polarized Reflection and Material Dataset of Real World Objects
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 606
LaSM: Layer-wise Scaling Mechanism for Defending Pop-up Attack on GUI Agents
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 607
RaPA: Enhancing Transferable Targeted Attacks via Random Parameter Pruning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 608
All Vehicles Can Lie: Efficient Adversarial Defense in Fully Untrusted-Vehicle Collaborative Perception via Pseudo-Random Bayesian Inference
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 609
A Combination of Noise and Bilateral Filters Achieve Supralinear and Scalable Adversarial Robustness in CNNs
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 611
Write Where It Matters: Policy-Guided Watermarks for 3D Gaussian Splatting
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 612
Attack for Defense: Adversarial Agents for Point Prompt Optimization Empowering Segment Anything Model
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 613
RevINN: An End-to-End Invertible Neural Network for Reversible Adversarial Examples Generation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 614
CamPI: Physical Adversarial Examples through Camera Power Signal Injection
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 616
GraspALL: Adaptive Structural Compensation from Illumination Variation for Robotic Garment Grasping in Any Low-Light Conditions
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 617
Opening the Sim-to-Real Door for Humanoid Pixel-to-Action Policy Transfer
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 618
Learning Cross-View Object Correspondence via Cycle-Consistent Mask Prediction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 619
RoboWheel: A Data Engine from Real-World Human Demonstrations for Cross-Embodiment Robotic Learning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 620
Chain of World: World Model Thinking in Latent Motion
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 621
Scalable Feature Matching via State Space Modeling and Sparse Correlation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 622
Video2Robo: 3DGS-based Synthetic Data from One Video Enables Scalable Robot Learning
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 623
ConsisVLA-4D: Advancing Spatiotemporal Consistency in Efficient 3D-Perception and 4D-Reasoning for Robotic Manipulation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 624
SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 625
GeoDexGrasp: Geometry-aware Generation for Data-efficient and Physics-plausible Dexterous Grasping
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 626
Lifelong Imitation Learning with Multimodal Latent Replay and Incremental Adjustment
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 627
From Observation to Action: Latent Action-based Primitive Segmentation for VLA Pre-training in Industrial Settings
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 628
AGiLe: Learning Robust Long-Horizon Manipulation via Affordance-Grounded Bidirectional Latent Planning
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 629
Language-Grounded Decoupled Action Representation for Robotic Manipulation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 630
Learning to Act Robustly with View-Invariant Latent Actions
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 631
ORBIT: Benchmarking SfM in the Wild with 360° Video
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 632
SpikeTrack: A Spike-driven Framework for Efficient Visual Tracking
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 633
Time Without Time: Pseudo-Temporal Representation for Space-Time Super-Resolution
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 634
Envisioning the Future, One Step at a Time
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 635
FlowFM: Advancing Dark Optical Flow Estimation with Flow Matching
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 636
Drift-Resilient Temporal Priors for Visual Tracking
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 637
An Efficient Token Compression Framework for Visual Object Tracking
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 638
No Labels, No Look-Ahead: Unsupervised Online Video Stabilization with Classical Priors
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 639
From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 640
Momentum Memory for Knowledge Distillation in Computational Pathology
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 641
Modeling the Brain’s Grammar: ROI-Guided fMRI Pretraining for Transferable and Interpretable Vision Decoding
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 642
Joint Spectral Image Reconstruction and Semantic Segmentation with Cooperative Unfolding
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 643
X-WIN: Building Chest Radiograph World Model via Predictive Sensing
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 644
fMRI-LM: Towards a Universal Foundation Model for Language-Aligned fMRI Understanding
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 645
Tell2Adapt: A Unified Framework for Source Free Unsupervised Domain Adaptation via Vision Foundation Model
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 646
TIM: Temporal Decoupling with Iterative Mutual-Refinement Model for Longitudinal Radiology Report Generation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 647
Ultrasound-CLIP: Semantic-Aware Contrastive Pre-training for Ultrasound Image-Text Understanding
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 649
BiGMINT: Biologically-guided Hierarchical Multimodal Integration for Modeling Multiple Compound Activities in Drug Discovery
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 650
Modeling Spatiotemporal Neural Frames for High Resolution Brain Dynamic
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 651
CMR-RD: Long-Tailed Adaptive VLM for Explainable CMR Diagnosis
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 652
Clinically-Grounded Counterfactual Reasoning for Medical Video Diagnosis
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 653
FBTA: Enabling Single-GPU End-to-End Gigapixel WSI Classification with Feature Bridging and Translation Alignment
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 654
Ultra Diffusion Poser: Diffusion-Based Human Motion Tracking from Sparse Inertial Sensors and Ranging-based Between-sensor Distances
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 655
Egocentric Visibility-Aware Human Pose Estimation
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 656
Shoe Style-Invariant and Ground-Aware Learning for Dense Foot Contact Estimation
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 657
OMG-Bench: A New Challenging Benchmark for Skeleton-based Online Micro Hand Gesture Recognition
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 658
Recovering Physically Plausible Human-Object Interactions from Monocular Videos
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 659
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 660
TeHOR: Text-Guided 3D Human and Object Reconstruction with Textures
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 661
SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 662
CrossHOI: Learning Cross-View Representations for Monocular 3D Human-Object Interaction Reconstruction
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 663
Gaussian-Mixture Latent Flow for Stochastic 3D Human Motion Prediction
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 664
SGSoft: Learning Fused Semantic-Geometric Features for 3D Shape Correspondence via Template-Guided Soft Signals
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 665
Beyond Single-View Sufficiency: CVBench for Cross-View Human Understanding
[
Poster]
Poster
Fri Jun 05 09:45 AM -- 11:45 AM (PDT) @ ExHall A-F 666
Breaking Spurious Correlations: Uncertainty-Driven Causal Transformers for AU Detection
[
Poster]