Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 1
Evidential Neural Radiance Fields
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 2
Global-Aware Edge Prioritization for Pose Graph Initialization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 3
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 4
Optical Flow Matching: Reframing Optical Flow as Continuous Transport Dynamics
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 5
SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 6
U^2Flow: Uncertainty-Aware Unsupervised Optical Flow Estimation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 7
AToken: A Unified Tokenizer for Vision
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 8
Confusion-Aware Spectral Regularizer for Long-Tailed Recognition
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 9
Learning Latent Concepts for Detecting Out-of-Distribution Objects
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 10
Learning Like Humans: Analogical Concept Learning for Generalized Category Discovery
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 11
Understanding and Enforcing Weight Disentanglement in Task Arithmetic
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 13
AT-VLA: Adaptive Tactile Injection for Enhanced Feedback Reaction in Vision-Language-Action Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 14
Learning Diffeomorphism for Medical Image Registration with Time-Embedded Architectures Using Semigroup Regularization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 15
QuadSync: Quadrifocal Tensor Synchronization via Tucker Decomposition
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 16
SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 17
Structural Action Transformer for 3D Dexterous Manipulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 18
TESO: Online Tracking of Essential Matrix by Stochastic Optimization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 19
BoostSLT: Boosting Sign Language Translation via a Plug-and-Play Diffusion-Based Semantic Enhancer
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 20
ImmerIris: A Large-Scale Dataset and Benchmark for Off-Axis and Unconstrained Iris Recognition in Immersive Applications
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 21
OLATverse: A Large-scale Real-world Object Dataset with Precise Lighting Control
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 22
OpenDance: Multimodal Controllable 3D Dance Generation with Large-scale Internet Data
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 23
POLAR: A Portrait OLAT Dataset and Generative Framework for Illumination-Aware Face Modeling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 24
Relightable Holoported Characters: Capturing and Relighting Dynamic Human Performance from Sparse Views
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 25
Scaling View Synthesis Transformers
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 26
WildPose: A Unified Framework for Robust Pose Estimation in the Wild
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 27
MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 28
Revisiting Monocular SLAM with Spatio-Temporal Scene Modeling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 29
Minimal Constraint Relaxation for Multiview Autocalibration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 30
Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 31
GGPT: Geometry-Grounded Point Transformer
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 32
MERG3R: A Divide-and-Conquer Approach to Large-Scale Neural Visual Geometry
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 33
Unlocking the Power of Critical Factors for 3D Visual Geometry Estimation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 34
KV-Tracker: Real-Time Pose Tracking with Transformers
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 35
InstructMix2Mix: Consistent Sparse-View Editing Through Multi-View Model Personalization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 36
From Rays to Projections: Better Inputs for Feed-Forward View Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 37
SLARM: Streaming and Language-Aligned Reconstruction Model for Dynamic Scenes
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 38
Parallel Rigidity Matters for Bundle Adjustment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 39
Simple but Effective Triplet-Based Compression Strategies for Compact Visual Localization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 40
VIAFormer: Voxel-Image Alignment Transformer for High-Fidelity Voxel Refinement
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 41
Mining Attribute Subspaces for Efficient Fine-tuning of 3D Foundation Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 42
DualPrim: Compact 3D Reconstruction with Positive and Negative Primitives
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 43
StyleGallery: Training-free and Semantic-aware Personalized Style Transfer from Arbitrary Image References
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 44
DynFusion: Rethinking Condition Fusion for Adaptive Multi-Conditional Text-to-Image Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 45
Agentic Retoucher for Text-To-Image Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 46
StyleDoctor: Towards Specialist Reward Model for Style-centric Generation Tasks
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 47
SwitchCraft: Training-Free Multi-Event Video Generation with Attention Controls
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 48
Premier: Personalized Preference Modulation with Learnable User Embedding in Text-to-Image Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 49
Paper2Figure: A Multi-Agent Collaborative System for Figure Generation Towards Academic Research Paper
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 50
Adapting In-context Generation for Enhanced Composed Image Retrieval
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 51
Transition Models: Rethinking the Generative Learning Objective
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 52
Rethinking Glyph Spatial Information in Font Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 53
StreamDiT: Real-Time Streaming Text-to-Video Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 54
ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 55
Camera Control for Text-to-Image Generation via Learning Viewpoint Tokens
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 56
3D Space as a Scratchpad for Editable Text-to-Image Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 57
Aligning Multi-Character Narrative Image Generation with Multi-Aspect Human Preferences
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 58
FoleyDirector: Directing Temporal Controllable Video-to-Audio Generation via Fine-Grained Temporal Scripts
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 59
DCoAR: Deep Concept Injection into Unified Autoregressive Models for Personalized Text-to-Image Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 60
DreamOmni2: Multimodal Instruction-based Generation and Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 61
AutoDebias: An Automated Framework for Detecting and Mitigating Backdoor Biases in Text-to-Image Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 62
PosterIQ: A Design Perspective Benchmark for Poster Understanding and Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 63
IVAAN: Instance-level Vision-Language Alignment via Attribute-Guided Text Prompts Generation for Nuclei Analysis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 64
IsoCLIP: Decomposing CLIP Projectors for Efficient Intra-modal Alignment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 65
TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 66
BioVITA: Biological Dataset, Model, and Benchmark for Visual-Textual-Acoustic Alignment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 67
Boosting Visual Reprogramming for CLIP with Dual Granularity Alignment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 68
Decouple to Generalize: Context-First Self-Evolving Learning for Data-Scarce Vision-Language Reasoning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 69
UniGen-1.5: Enhancing Image Generation and Editing through Reward Unification in RL
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 70
PolySLGen: Online Multimodal Speaking-Listening Reaction Generation in Polyadic Interaction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 71
Label What Matters: Modality-Balanced and Difficulty-Aware Multimodal Active Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 72
Unified Personalized Understanding, Generating and Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 73
MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 74
Towards Uncertainty-aware Unsupervised Domain Adaptation for Videos and Time-Series with Causal Optimal Transport
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 75
Foundation Model Priors Enhance Object Focus in Feature Space for Source-Free Object Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 76
Decision Boundary-aware Generation for Long-tailed Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 77
Towards Stable Federated Continual Test-Time Adaptation in Wild World
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 78
HyCal: A Training-Free Prototype Calibration Method for Cross-Discipline Few-Shot Class-Incremental Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 79
ACE-Merging: Data-Free Model Merging with Adaptive Covariance Estimation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 80
CHIPS: Efficient CLIP Adaptation via Curvature-aware Hybrid Influence-based Data Selection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 81
Addressing Exacerbated Attention Sink for Source-Free Cross-Domain Few-Shot Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 82
Depth Hypothesis Guided Iterative Refinement for Event–Image Monocular Depth Estimation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 83
High-Quality and Efficient Turbulence Mitigation with Events
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 84
Tracking through Severe Occlusion via Event-Derived Transient Cues
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 85
FastEventDGS: Deformable Gaussian Splatting for Fast Dynamic Scenes from a Single Event Camera
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 86
Event-Based Motion Deblurring Using Task-Oriented 3D Gaussian Event Representations
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 87
From Corners to Fiducial Tags: Revisiting Checkerboard Calibration for Event Cameras
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 88
Extending Embodied Question Answering from Perception to Decision
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 89
Dejavu: Towards Experience Feedback Learning for Embodied Intelligence
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 90
Demo2Tutorial: From Human Experience to Multimodal Software Tutorials
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 91
MaskDexGrasp: Generative Masked Modeling for Part-Aware Dexterous Grasp Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 92
Predict Before You Explore: Predictive Planning with Specialized Memory for Embodied Question Answering
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 93
VideoWeaver: Multimodal Multi-View Video-to-Video Transfer for Embodied Agents
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 94
MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 95
Align While Search: Belief-Guided Exploratory Inference for World-Grounded Embodied Agents
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 96
Rethinking Intermediate Representation for VLM-based Robot Manipulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 98
FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-and-Language Navigation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 99
UniLight: A Unified Representation for Lighting
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 100
MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 101
Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 102
Hist2Style: Histogram-Guided Stylization with Bilateral Grids
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 103
Harmonic Canvas: Inversion-Free Editing for Visually-Guided Music Style Transfer
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 104
How to Take a Memorable Picture? Empowering Users with Actionable Feedback
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 105
UniEdit-I: Training-free Image Editing for Unified VLM via Iterative Understanding, Editing and Verifying
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 106
SCIEval: Evaluating and Benchmarking the Faithfulness of Scientific Image Generation and Interpretation with Large Multimodal Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 107
GeoRelight: Learning Joint Geometrical Reconstruction and Relighting with Flexible Multi-Modal Diffusion Transformers
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 108
HAD: Hallucination-Aware Diffusion Priors for 3D Reconstruction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 109
Catalyst4D: High-Fidelity 3D-to-4D Scene Editing via Dynamic Propagation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 110
ReFlow: Self-correction Motion Learning for Dynamic Scene Reconstruction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 111
Semantic Foam: Unifying Spatial and Semantic Scene Decomposition
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 112
NVGS: Neural Visibility for Occlusion Culling in 3D Gaussian Splatting
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 113
NeAR: Coupled Neural Asset–Renderer Stack
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 114
Thermal is Always Wild: Characterizing and Addressing Challenges in Thermal-Only Novel View Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 115
PhysGM: Large Physical Gaussian Model for Feed-Forward 4D Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 116
Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 117
TM-BSN: Triangular-Masked Blind-Spot Network for Real-World Self-Supervised Image Denoising
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 118
Multinex: Lightweight Low-light Image Enhancement via Multi-prior Retinex
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 119
Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 120
ExpoCM: Exposure-Aware One-Step Generative Single-Image HDR Reconstruction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 121
Physically-Grounded Turbulence Mitigation with Frame-Shared Degradation Parameters
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 122
Convexity-Aware Noise Calibration: A Self-Supervised Framework for Noise-Level-Unknown Image Denoising
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 123
UCMNet: Uncertainty-Aware Context Memory Network for Under-Display Camera Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 124
Beyond the Ground Truth: Enhanced Supervision for Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 125
ShiftLUT: Spatial Shift Enhanced Look-Up Tables for Efficient Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 126
Bilevel Layer-Positioning LoRA for Real Image Dehazing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 127
SD-FSMIS: Adapting Stable Diffusion for Few-Shot Medical Image Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 129
SHAPE: Structure-aware Hierarchical Unsupervised Domain Adaptation with Plausibility Evaluation for Medical Image Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 130
Delving Aleatoric Uncertainty in Medical Image Segmentation via Vision Foundation Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 131
Revisiting 2D Foundation Models for Scalable 3D Medical Image Classification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 133
Simple-ViLMedSAM: Simple Text Prompts Meet Vision-Language Models for Medical Image Segmentation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 134
NeuroSeg Meets DINOv3: Transferring 2D Self-Supervised Visual Priors to 3D Neuron Segmentation via DINOv3 Initialization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 135
Multi-Paradigm Collaborative Adversarial Attack Against Multi-Modal Large Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 136
TINA: Text-Free Inversion Attack for Unlearned Text-to-Image Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 137
Jailbreaking Vision-Language Models via Dissonance-Guided Suffix Optimization and Image–Phrase Injection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 138
BlackMirror: Black-Box Backdoor Detection for Text-to-Image Models via Instruction-Response Deviation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 139
VCP-Attack: Visual-Contrastive Projection for Transferable Black-Box Targeted Attacks on Large Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 140
Adapter Shield: A Unified Framework with Built-in Authentication for Preventing Unauthorized Zero-Shot Image-to-Image Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 141
LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 142
Transform to Transfer: Boosting Adversarial Attack Transferability on Vision-Language Pre-training Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 143
Mask to Align, Weight to Disambiguate: Reliable Unsupervised Cross-Modal Hashing with Masked-Weight Contrast
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 144
Reliable Clustering Number Estimation for Contrastive Multi-View Clustering
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 145
Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 146
Enhance-then-Balance Modality Collaboration for Robust Multimodal Sentiment Analysis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 147
SonoWorld: From One Image to a 3D Audio-Visual Scene
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 149
EXOTIC: External Vision-driven Incomplete Multi-view Classification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 150
Easy2Hard: From Partially to Fully Unmatched Modalities as Negative Samples in Contrastive Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 151
OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 152
BALM: A Model-Agnostic Framework for Balanced Multimodal Learning under Imbalanced Missing Rates
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 153
UniT: Unified Multimodal Chain-of-Thought Test-time Scaling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 154
Multi-modal Test-time Adaptation via Adaptive Probabilistic Gaussian Calibration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 155
Information-Theoretic Decomposition for Multimodal Interaction Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 156
Is the Modality Gap a Bug or a Feature? A Robustness Perspective
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 157
Omni-Fake: Benchmarking Unified Multimodal Social Media Deepfake Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 158
MUST: Modality-Specific Representation-Aware Transformer for Diffusion-Enhanced Survival Prediction with Missing Modality
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 159
VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 160
MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 161
SeD-UD: An Influence-Driven and Hierarchically-Decoupled Information Bottleneck for Multimodal Intent Recognition
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 162
MultiModalPFN: Extending Prior-Data Fitted Networks for Multimodal Tabular Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 163
LacTokGen: Latent Consistency Tokenizer for 1024-pixel Image Generation by 256 Tokens
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 164
FlowSteer: Guiding Few-Step Image Synthesis with Authentic Trajectories
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 165
Visual Autoregressive Modeling via Next Focus Prediction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 166
Semantic Context Matters: Improving Conditioning for Autoregressive Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 167
TempoMaster: Efficient Long Video Generation via Next-Frame-Rate Prediction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 168
FlashIn: Fast and Accurate Image Inversion for Real-time Image Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 169
EasyV2V: A High-quality Instruction-based Video Editing Framework
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 170
One Algorithm to Align Them All
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 171
VGA-Bench: A Unified Benchmark and Multi-Model Framework for Video Aesthetics and Generation Quality Evaluation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 172
Improved Mean Flows: On the Challenges of Fastforward Generative Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 173
SynMotion: Semantic-Visual Adaptation for Motion Customized Video Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 174
Match-and-Fuse: Consistent Generation from Unstructured Image Sets
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 175
Mixture of Style Experts for Diverse Image Stylization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 176
Mirai: Autoregressive Visual Generation Needs Foresight
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 177
Align Images Before You Generate
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 178
Bridging the Perception Gap in Image Super-Resolution Evaluation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 179
Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 180
Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 181
IAFMNet: Information-Aware Feature Modulation for Efficient Super-Resolution
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 182
Physics-Consistent Diffusion for Efficient Fluid Super-Resolution via Multiscale Residual Correction
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 183
Bridging Fidelity-Reality with Controllable One-Step Diffusion for Image Super-Resolution
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 184
Omni-Supervised Motion Editing: Balancing Change and Invariance through Positive-Negative Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 185
FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 186
Cross-Axis Feature Fusion with Joint-Wise Motion Difference Prediction for Text-Based 3D Human Motion Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 187
MotionMaster: Generalizable Text-Driven Motion Generation and Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 188
OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 189
Towards Decompositional Human Motion Generation with Energy-Based Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 190
PAMotion: Physics-Aware Motion Generation for Full-Body Interaction with Multiple Objects
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 191
Sketch2Colab: Sketch-Conditioned Multi-Human Animation via Controllable Flow Distillation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 192
ViHOI: Human-Object Interaction Synthesis with Visual Priors
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 193
CLEP: Contrastive Language-Pose Pretraining
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 194
OpenFS: Multi-Hand-Capable Fingerspelling Recognition with Implicit Signing-Hand Detection and Frame-Wise Letter-Conditioned Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 195
ARMFlow: AutoRegressive MeanFlow for Online 3D Human Reaction Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 196
InterPhys: Physics-aware Human Motion Synthesis in a Dynamic Scene
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 197
Beyond Mimicry: Learning Whole-Body Human-Humanoid Interaction from Human-Human Demonstrations
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 198
PHAC: Promptable Human Amodal Completion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 199
CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 200
IntrinsicWeather: Controllable Weather Editing in Intrinsic Space
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 201
Outlier-Robust Diffusion Solvers for Inverse Problems
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 202
Beyond Fixed Formulas: Data-Driven Linear Predictor for Efficient Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 204
Diff-SemiER: Transparency-Aware Adaptive Fusion Diffusion Model with Generative Prior for Semi-Transparent Eyeglasses Removal
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 205
KLIP: Localized Distribution Shift Detection via KL-Divergence with Diffusion Priors in Inverse Problems
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 206
Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 207
Taming Generative Diffusion Model for Task-Oriented Infrared Imaging
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 208
Attention, May I Have Your Decision? Localizing Generative Choices in Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 209
RxnCaption: Reformulating Reaction Diagram Parsing as Visual Prompt Guided Captioning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 210
More than the Sum: Panorama-Language Models for Adverse Omni-Scenes
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 211
DiGraphHal-Bench: Evaluating Multimodal Large Language Models on Complex Directed Graphs
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 212
SEA-Vision: A Multilingual Benchmark for Comprehensive Document and Scene Text Understanding in Southeast Asia
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 214
Spot The Ball: A Benchmark for Visual Social Inference
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 216
E-comIQ-ZH: A Human-Aligned Dataset and Benchmark for Fine-Grained Evaluation of E-commerce Posters with Chain-of-Thought
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 217
GeoWorld: Geometric World Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 218
ORD: Object-Relation Decoupling for Generalized 3D Visual Grounding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 219
Benchmarking PhD-Level Coding in 3D Geometric Computer Vision
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 220
MonoVLM: Monocular 3D Visual Grounding with Vision Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 221
Curvature-Aware Captioning: Leveraging Geodesic Attention for 3D Scene Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 222
SPREAD: Spatial-Physical REasoning via geometry Aware Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 223
ExtrinSplat: Decoupling Geometry and Semantics for Open-Vocabulary Understanding in 3D Gaussian Splatting
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 224
SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 225
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 226
VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 227
Merge3D: Efficient 3D Multimodal LLMs via Joint 2D-3D Token Merging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 228
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 229
LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 230
Quota-Calibrated Fine-Grained Alignment with Context-Aware Marginals for Text-based Person Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 231
Evo-Retriever: LLM-Guided Curriculum Evolution with Viewpoint-Pathway Collaboration for Multimodal Document Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 232
Taxonomy-Aware Representation Alignment for Hierarchical Visual Recognition with Large Multimodal Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 233
FAAR: Efficient Frequency-Aware Multi-Task Fine-Tuning via Automatic Rank Selection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 234
Model Merging in the Essential Subspace
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 235
Beyond Semantic Search: Towards Referential Anchoring in Composed Image Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 236
SAVE: Speech-Aware Video Representation Learning for Video-Text Retrieval
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 237
MarkushGrapher-2: End-to-end Multimodal Recognition of Chemical Structures
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 239
EthoCLIP: Ontology-Enhanced Video-Language Pretraining for Animal Behavior Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 240
TrajTok: Learning Trajectory Tokens Enhances Video Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 241
Streaming Video Instruction Tuning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 242
VidPrism: Heterogeneous Mixture of Experts for Image-to-Video Transfer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 244
From Static to Dynamic: Exploring Self-supervised Image-to-Video Representation Transfer Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 245
Learnable Motion-Focused Tokenization for Effective and Efficient Video Unsupervised Domain Adaptation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 246
FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 247
Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 248
Video Panels for Long Video Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 249
Gaze Target Estimation Anywhere with Concepts
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 250
Select, Hypothesize and Verify: Towards Verified Neuron Concept Interpretation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 251
Finding Distributed Object-Centric Properties in Self-Supervised Transformers
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 253
See Through the Noise: Improving Domain Generalization in Gaze Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 254
Mechanisms of Object Localization in Vision–Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 255
mmWaveFlow: Unified Enhancement and Generation of mmWave Human Point Clouds
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 256
From Feature Learning to Spectral Basis Learning: A Unifying and Flexible Framework for Efficient and Robust Shape Matching
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 257
Topology-aware Feature Propagation for Unsupervised Non-rigid Point Cloud Correspondence
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 258
BEV-SLD: Self-Supervised Scene Landmark Detection for Global Localization with LiDAR Bird’s-Eye View Images
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 259
SAG-GNN: Semantic-Aware Guided GNN for Descriptor-Free 2D-3D Matching
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 260
LiREC-Net: A Target-Free and Learning-Based Network for LiDAR, RGB, and Event Calibration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 261
GM-R^2: Generative Matching Learning for Unsupervised Geometric Representation and Registration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 262
4D Local Modeling Toward Dynamic Global Perception for Ambiguity-free Rotation-Invariant Point Cloud Analysis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 263
PointNSP: Autoregressive 3D Point Cloud Generation with Next-Scale Level-of-Detail Prediction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 264
MORE-STEM: Long-Short MemOry REcall and Spatio-TEmporal Consistency Model for Query-Driven 3D/4D Point Cloud Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 265
Low-Rank Test-Time Training for Pre-Trained Point Cloud Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 266
STAR: Test-Time Adaptation Can Enhance Universal Prompt Learning for Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 267
Exploring Visual Pretraining for Learning Language Intelligence
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 268
VL-Eraser: Vacuum Distillation for Machine Unlearning in Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 269
DeAR: Fine-Grained VLM Adaptation by Decomposing Attention Head Roles
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 270
SynCLIP: Synonym-Coherent Language-Image Pretraining for Robust Open-Vocabulary Dense Perception
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 271
MODIX: A Training-Free Multimodal Information-Driven Positional Index Scaling for Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 272
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 273
ORION: ORthonormal Text Encoding for Universal VLM AdaptatION
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 274
CASPA: Graph-Structured Concept Anchors for Modality-Agnostic Adaptation in Vision–Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 276
HOG-Layout: Hierarchical 3D Scene Generation, Optimization and Editing via Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 277
Towards Human-Like Robot Handwriting via Contour-Aware Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 278
MajutsuCity: Language-driven Aesthetic-adaptive City Generation with Controllable 3D Assets and Layouts
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 279
VectorArk: Learning Practical Image Vectorization with Rounded Polygon Representation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 280
OctoT2I: A Self-Evolving Agentic Text-to-Image Router
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 281
LottieGPT: Tokenizing Vector Animation for Autoregressive Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 282
SEA: Evaluating Sketch Abstraction Efficiency via Element-level Commonsense Visual Question Answering
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 283
Selective Amnesia using Contrastive Subnet Erasure for Class Level Unlearning in Vision Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 284
A Closed-Form Solution for Debiasing Vision-Language Models with Utility Guarantees Across Modalities and Tasks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 285
Rank-Guided Pseudo-Bias Learning for Robust Black-Box Adaptation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 286
Diagnosing and Repairing Unsafe Channels in Vision-Language Models via Causal Discovery and Dual-Modal Safety Subspace Projection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 287
WaTeRFlow: Watermark Temporal Robustness via Flow Consistency
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 288
DSO: Direct Steering Optimization for Bias Mitigation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 290
SineProject: Machine Unlearning for Stable Vision-Language Alignment
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 291
HiLoRA: Hierarchical Low-Rank Adaptation for Personalized Federated Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 292
OS-Fed: One Snapshot Is All You Need
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 293
FedAlign: Differentially Private Distribution Alignment for Non-IID Federated Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 294
Guiding Diffusion Models with Fine-Grained Conditions and Semantics-Preserving Sampling for One-Shot Federated Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 295
Personalized Federated Training of Diffusion Models with Privacy Guarantees
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 297
Understanding Temporal Logic Consistency in Video-Language Models through Cross-Modal Attention Discriminability
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 298
Small Object, Great Challenge: A Benchmark for Small Object Visual Grounding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 299
UFVideo: Towards Unified Fine-Grained Video Cooperative Understanding with Large Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 300
ReMoRa: Multimodal Large Language Model based on Refined Motion Representation for Long-Video Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 301
CaST-Bench: Benchmarking Causal Chain-Grounded Spatio-Temporal Reasoning for Video Question Answering
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 302
HERO: Hierarchical Embedding-Refinement for Open-Vocabulary Temporal Sentence Grounding in Videos
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 303
Scaling the Long Video Understanding of Multimodal Large Language Models via Visual Memory Mechanism
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 304
Hybrid Token Compression for Vision-Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 305
Focus, Don’t Prune: Identifying Instruction-Relevant Regions for Information-Rich Image Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 306
When Token Pruning is Worse than Random: Understanding Visual Token Information in VLLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 307
VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 308
BiGain: Unified Token Compression for Joint Generation and Classification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 309
Hi-Lo Prune: Look at What You'll Lose before Pruning with Hierarchical Token Selection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 310
VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 311
Bridge: Basis-Driven Causal Inference Marries VFMs for Domain Generalization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 312
In Pursuit of Pixel Supervision for Visual Pre-training
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 313
GaussianMatch: Semi-Supervised Regression with Pseudo-Label Filtering via Multi-View Gaussian Consistency
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 314
TAR: Token-Aware Refinement for Fine-grained Generalized Category Discovery
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 315
Semantic Noise Reduction via Teacher-Guided Dual-Path Audio-Visual Representation Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 316
The Universal Normal Embedding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 317
Bypassing the Transport Plan: Dynamic Reweighting for Out-of-Distribution Detection with Optimal Transport
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 318
Cross-domain Dual-stream Feature Disentanglement for Brain Disorder Prediction with Sparsely Labeled PET
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 319
Debiased Sample Selection for Learning with Noisy Labels
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 321
Open-Ended Instruction Realization with LLM-Enabled Multi-Planner Scheduling in Autonomous Vehicles
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 322
EE-RL: Vision Language Guided Reinforcement Learning with Explorer and Expert model for End-to-End Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 323
Sensor2Sensor: Cross-Embodiment Sensor Conversion for Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 324
SHARP: Short-Window Streaming for Accurate and Robust Prediction in Motion Forecasting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 325
DriveCombo: Benchmarking Compositional Traffic Rule Reasoning in Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 326
CausalVAD: De-confounding End-to-End Autonomous Driving via Causal Intervention
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 328
Learning to Drive is a Free Gift: Large-Scale Label-Free Autonomy Pretraining from Unposed In-The-Wild Videos
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 329
WhisperNet: A Scalable Solution for Bandwidth-Efficient Collaboration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 330
Efficient Equivariant Transformer for Self-Driving Agent Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 331
Generalizable Co-Salient Object Detection via Mixed Content-Style Modulation
[
Slides]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 332
Saliency-Driven Token Merging for Vision Transformers
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 333
RISE: Single Static Radar-based Indoor Scene Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 334
Mixture-of-Experts based Feature Decoupling for Open Vocabulary Scene Graph Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 335
TF-SSD: A Strong Pipeline via Synergic Mask Filter for Training-free Co-salient Object Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 336
Denoise and Align: Towards Source-Free UDA for Robust Panoramic Semantic Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 337
SPOT: Spatiotemporal Prompt Optimization for Motion-Stabilized MLLM-Guided Video Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 338
Changes in Real Time: Online Scene Change Detection with Multi-View Fusion
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 339
Subspace Alignment for CLIP-based Continual Learning via Canonical Correlation Analysis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 340
DGS: Dual Gradient and Semantic-Shift Guided Low-Rank Adaptation for Class Incremental Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 341
Dynamic Magic: Unleashing Restricted Knowledge for Lifelong Person Re-Identification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 342
Which Concepts to Forget and How to Refuse? Decomposing Concepts for Continual Unlearning in Large Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 343
Temporal Imbalance of Positive and Negative Supervision in Class-Incremental Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 344
Forging a Dynamic Memory: Retrieval-Guided Continual Learning for Generalist Medical Foundation Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 345
Dance Across Shifts: Forward-Facilitation Continual Test-Time Adaptation through Dynamic Style Bridging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 346
Few-Shot Hybrid Incremental Learning: Continually Learning under Data Scarcity and Task Uncertainty
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 347
High-Fidelity Mobile Avatars with Pruned Local Blendshapes
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 348
PhysSkin: Real-Time and Generalizable Physics-Based Animation via Self-Supervised Neural Skinning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 349
Bridging Privacy and Provenance: Traceable Virtual Identity Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 350
PortraitDirector: A Hierarchical Disentanglement Framework for Controllable and Real-time Facial Reenactment
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 351
Dynamic Label Noise Suppression with Optimal Teacher Pool for Facial Expression Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 352
MimicTalker: A Multimodal Interactive and Memory-Enhanced Framework for Real-Time Dyadic 3D Head Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 353
DecoVLN: Decoupling Observation, Reasoning, and Correction for Vision-and-Language Navigation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 354
HybridDriveVLA: Vision-Language-Action Model with Visual CoT reasoning and ToT Evaluation for Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 355
NavForesee: A Unified Vision-Language World Model for Hierarchical Planning and Dual-Horizon Navigation Prediction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 356
LookasideVLN: Direction-Aware Aerial Vision-and-Language Navigation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 359
FreeForm: Reduced-Order Deformable Simulation from Particle-Based Skinning Eigenmodes
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 360
GeoDiff4D: Geometry-Aware Diffusion for 4D Head Avatar Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 361
4DEquine: Disentangling Motion and Appearance for 4D Equine Reconstruction from Monocular Video
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 362
PhysHO: Physics-Based Dynamic 3D Gaussian Human and Object from Monocular Video
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 363
ProgressiveAvatars: Progressive Animatable 3D Gaussian Avatars
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 364
ZINA: Multimodal Fine-grained Hallucination Detection and Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 365
Mitigating Multimodal Hallucinations via Gradient-based Self-Reflection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 366
HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 367
KVSmooth: Mitigating Hallucination in Multi-modal Large Language Models through Key-Value Smoothing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 368
ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Video Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 369
Tell Model Where to Look: Mitigating Hallucinations in MLLMs by Vision-Guided Attention
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 370
Circular-DPO: Aligning Multi-Stage 3D Generative Models via Preference Feedback Loop
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 371
Cloning Deterministic Worlds: The Critical Role of Latent Geometry in Long-Horizon World Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 372
PrITTI: Primitive-based Generation of Controllable and Editable 3D Semantic Urban Scenes
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 373
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 374
ExPose: Reinforcing Video Generation Models for Extreme Pose Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 375
Choreographing a World of Dynamic Objects
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 376
SounDiT: Geo-Contextual Soundscape-to-Landscape Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 377
Vista4D: Video Reshooting with 4D Point Clouds
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 378
CamDirector: Towards Long-Term Coherent Video Trajectory Editing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 379
Elastic3D: Controllable Stereo Video Conversion with Guided Latent Decoding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 380
Decoupling Bias, Aligning Distributions: Synergistic Fairness Optimization for Deepfake Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 381
Target-Aware Invertible Encoder with Reconstruction Guidance for Infrared Small Target Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 382
BDNet: Bio-Inspired Dual-Backbone Small Object Detection Network
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 383
ElasticFormer: Detecting Objects in HRW Shots via Elastic Computing Vision Transformer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 384
RGB-Event based Pedestrian Attribute Recognition: A Benchmark Dataset and An Asymmetric RWKV Fusion Framework
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 385
FusionAgent: A Multimodal Agent with Dynamic Model Selection for Human Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 386
Free-Grained Hierarchical Visual Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 387
URICA: A Uniformity Region Affine Identifier Capture Algorithm for Arbitrary Region Retrieval in Pathology Images
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 389
DetAny4D: Detect Anything 4D Temporally in a Streaming RGB Video
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 390
Follow the Saliency: Supervised Saliency for Retrieval-augmented Dense Video Captioning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 391
Video-CoE: Reinforcing Video Event Prediction via Chain of Events
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 392
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 393
VRR-QA: Visual Relational Reasoning in Videos Beyond Explicit Cues
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 394
Question-guided Visual Compression with Memory Feedback for Long-Term Video Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 395
CURVE: A Benchmark for Cultural and Multilingual Long Video Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 396
SVBench: Evaluation of Video Generation Models on Social Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 397
Hierarchical Long Video Understanding with Audiovisual Entity Cohesion and Agentic Search
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 398
LifeEval: A Multimodal Benchmark for Assistive AI in Egocentric Daily Life Tasks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 399
Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 401
YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 402
CADC: Content Adaptive Diffusion-Based Generative Image Compression
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 403
FG-Portrait: 3D Flow Guided Editable Portrait Animation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 404
ResCa: Residual Caching for Diffusion Transformers Acceleration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 405
IP-Adapter Is All You Need: Towards Fine-Tuning-Free Diffusion-Based Talking Face Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 406
SRA 2: Variational Autoencoder Self-Representation Alignment for Efficient Diffusion Training
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 407
InnoAds-Composer: Efficient Condition Composition for E-Commerce Poster Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 409
SODA: Sensitivity-Oriented Dynamic Acceleration for Diffusion Transformer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 410
DSERT-RoLL: Robust Multi-Modal Perception for Diverse Driving Conditions with Stereo Event-RGB-Thermal Cameras, 4D Radar, and Dual-LiDAR
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 412
ReManNet: A Riemannian Manifold Network for Monocular 3D Lane Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 413
PanDA: Unsupervised Domain Adaptation for Multimodal 3D Panoptic Segmentation in Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 414
STUR3D: Spatio-Temporal Unified Representation Learning for 3D Object Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 415
Exploring 6D Object Pose Estimation with Deformation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 416
SearchAD: Large-Scale Rare Image Retrieval Dataset for Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 417
Improving Vision-language Models with Perception-centric Process Reward Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 418
X-PCR: A Benchmark for Cross-modality Progressive Clinical Reasoning in Ophthalmic Diagnosis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 420
PhysInOne: Visual Physics Learning and Reasoning in One Suite
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 421
AviaSafe: A Physics-Informed Data-Driven Model for Aviation Safety–Critical Cloud Forecasts
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 422
TTRV: Test-Time Reinforcement Learning for Vision Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 423
Reading or Reasoning? Format Decoupled Reinforcement Learning for Document OCR
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 424
QUANTIPHY: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 425
VisRes Bench: On Evaluating the Visual Reasoning Capabilities of VLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 426
TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 427
Urban-GS: A Unified 3D Gaussian Splatting Framework for Compact and High-Fidelity Aerial-to-Street Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 428
Generalizable Sparse-View 3D Reconstruction from Unconstrained Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 429
RemedyGS: Defend 3D Gaussian Splatting Against Computation Cost Attacks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 430
SparseCam4D: Spatio-Temporally Consistent 4D Reconstruction from Sparse Cameras
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 431
IDESplat: Iterative Depth Probability Estimation for Generalizable 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 432
GS^2: Graph-based Spatial Distribution Optimization for Compact 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 433
OnlinePG: Online Open-Vocabulary Panoptic Mapping with 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 434
Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 435
Learning Explicit Continuous Motion Representation for Dynamic Gaussian Splatting from Monocular Videos
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 436
MLLMSplat: A 2D MLLM-Powered Framework for 3D Gaussian Splatting Understanding, Generation, and Editing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 437
Dropping Anchor and Spherical Harmonics for Sparse-view Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 438
RAP: Fast Feedforward Rendering-Free Attribute-Guided Primitive Importance Score Prediction for Efficient 3D Gaussian Splatting Processing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 439
Plug-and-Play PDE Optimization for 3D Gaussian Splatting: Toward High-Quality Rendering and Reconstruction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 440
PointGS: Semantic-Consistent Unsupervised 3D Point Cloud Segmentation with 3D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 442
Flow4DGS-SLAM: Optical Flow-Guided 4D Gaussian Splatting SLAM
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 443
Revisiting 3D Reconstruction Kernels as Low-Pass Filters
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 444
SR3R: Rethinking Super-Resolution 3D Reconstruction With Feed-Forward Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 445
GP-4DGS: Probabilistic 4D Gaussian Splatting from Monocular Video via Variational Gaussian Processes
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 446
VisRef: Visual Refocusing while Thinking Improves Test-Time Scaling in Multi-Modal Large Reasoning Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 447
IPR-1: Interactive Physical Reasoner
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 448
VIRO: Robust and Efficient Neuro-Symbolic Reasoning with Verification for Referring Expression Comprehension
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 450
Thinking in Dynamics: How Multimodal Large Language Models Perceive, Track, and Reason Dynamics in Physical 4D World
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 451
Latent Implicit Visual Reasoning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 452
Thinking with Programming Vision: Towards a Unified View for Thinking with Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 453
AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 454
All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 455
See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 456
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 457
ReaGEN: Adaptive Generation of Structured Chains-of-Thought for Efficient Multimodal Reasoning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 458
Breaking the Regional Perception Bottleneck of Multimodal Large Language Models via External Reasoning Framework
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 459
CodePercept: Code-Grounded Visual STEM Perception for MLLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 460
TableMix: Enhancing Multimodal Table Reasoning in MLLMs from a Data-Centric Perspective
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 461
Harnessing Chain-of-Thought Reasoning in Multimodal Large Language Models for Face Anti-Spoofing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 462
Grounded Chain-of-Thought for Multimodal Large Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 464
SegMo: Co-Designing Content-Aware Sparsity and Locally-Cohesive Segment Parallelism for Efficient VLM Inference
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 466
Compressed-Domain-Aware Online Video Super-Resolution
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 468
Is Bin Generation Indispensable? A Bin-Generation-Free Dataset Quantization via Semantic Perspective
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 469
High Resolution Neural Video Coding with Bi-directional Confidence-Guided Reference Information Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 470
Distributed Image Compression with Multimodal Side Information at Extremely Low Bitrates
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 471
Task-Aware Image Signal Processor for Advanced Visual Perception
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 472
Enhancing Video Vision Language Model with Hippocampal Sensing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 473
VIRD: View-Invariant Representation through Dual-Axis Transformation for Cross-View Pose Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 475
SoPE: Spherical Coordinate-Based Positional Embedding for Enhancing Spatial Perception of 3D LVLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 476
RHO: Robust Holistic OSM-Based Metric Cross-View Geo-Localization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 477
EfficientVPR: Toward Efficient Visual Place Recognition via Scene-Aware Prompt Tuning and Adaptive Feature Enhancement
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 478
Universal Guideline-Driven Image Clustering via a Hybrid LLM Agent
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 480
VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 481
Think, Then Verify: A Hypothesis–Verification Multi-Agent Framework for Long Video Understanding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 482
Reinforce to Learn, Elect to Reason: A Dual Paradigm for Video Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 483
Graph-to-Frame RAG: Visual-Space Knowledge Fusion for Training-Free and Auditable Video Reasoning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 485
Multi-Modal Image Fusion via Intervention-Stable Feature Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 486
ReCoFuse: Ultra-Robust Image Fusion via Restorative Multi-Modal Diffusion Reciprocal Coupling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 487
Degradation-Robust Fusion: An Efficient Degradation-Aware Diffusion Framework for Multimodal Image Fusion in Arbitrary Degradation Scenarios
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 488
DF^2-VB: Dual-level Fuzzy Fusion with View-specific Boosting for Multi-view Multi-label Classification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 489
UniFusion: A Unified Image Fusion Framework with Robust Representation and Source-Aware Preservation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 490
Self-guided Semantic Inspection for Zero-Shot Composed Image Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 491
G-MIXER: Geodesic Mixup-based Implicit Semantic Expansion and Explicit Semantic Re-ranking for Zero-Shot Composed Image Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 492
No Hard Negatives Required: Concept Centric Learning Leads to Compositionality without Degrading Zero-shot Capabilities of Contrastive Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 493
MUSE: Harnessing Precise and Diverse Semantics for Few-Shot Whole Slide Image Classification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 494
Pointing at Parts: Training-Free Few-Shot Grounding in Multimodal LLMs
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 495
Graph Attention Prototypical Network for Robust Few-Shot Classification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 496
Mitigating The Distribution Shift of Diffusion-based Dataset Distillation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 497
EVLF: Early Vision-Language Fusion for Generative Dataset Distillation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 498
Fixed Anchors Are Not Enough: Dynamic Retrieval and Persistent Homology for Dataset Distillation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 499
Flow Map Distillation Without Data
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 500
F^2HDR: Two-Stage HDR Video Reconstruction via Flow Adapter and Physical Motion Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 501
Learning Latent Transmission and Glare Maps for Lens Veiling Glare Removal
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 502
Inter-Photon-Limited Videography
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 503
A Bit is All You Need! Efficient Video Capture via Single Bit Imaging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 504
From Events to Clarity: The Event-Guided Diffusion Framework for Dehazing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 505
Electromagnetic Inverse Scattering from a Single Transmitter
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 506
Statistical Characteristic-Guided Denoising for Rapid High-Resolution Transmission Electron Microscopy Imaging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 507
Physics-Guided Multistep Deformation Reversal for Ancient Bamboo Slip Restoration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 508
cryoSENSE: Compressive Sensing Enables High-throughput Microscopy with Sparse and Generative Priors on the Protein Cryo-EM Image Manifold
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 509
SGDE: Self-supervised Geometry Degradation Estimation Framework for Coded Aperture Compressive Spectral Imaging
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 510
Factorized Context Aggregation for Robust Cancer Risk Estimation via Soft Re-Ranked Retrieval and Hierarchical Anchors
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 511
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 513
Depth Any Endoscopy: Towards Self-Supervised Generalizable Depth Estimation in Monocular Endoscopy
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 514
RoSAMDepth: Robust Self-supervised Depth Estimation Leveraging Segment Anything Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 515
AdaSFormer: Adaptive Serialized Transformers for Monocular Semantic Scene Completion from Indoor Environments
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 516
Dark3R: Learning Structure from Motion in the Dark
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 517
What Makes Good Synthetic Training Data for Zero-Shot Stereo Matching?
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 518
TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Dual-Level Scale-Oriented Contrast
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 519
Iris: Integrating Language into Diffusion-based Monocular Depth Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 520
Ov3R: Open-Vocabulary Semantic 3D Reconstruction from RGB Videos
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 521
M3DLayout: A Multi-Source Dataset of 3D Indoor Layouts and Structured Descriptions for 3D Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 522
UniPart: Part-Level 3D Generation with Unified 3D Geom–Seg Latents
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 523
Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 524
Mesh-Pro: Asynchronous Advantage-guided Ranking Preference Optimization for Artist-style Quadrilateral Mesh Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 525
Order Matters: 3D Shape Generation from Sequential VR Sketches
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 526
Think-Then-Generate: Structural Chain-of-Thought Reasoning for Consistent 3D Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 527
ArtLLM: Generating Articulated Assets via 3D LLM
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 528
PoseMaster: A Unified 3D Native Framework for Stylized Pose Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 529
2D-LFM: Lifting Foundation Model without 3D Supervision
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 530
ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 531
4DWorldBench: A Comprehensive Evaluation Framework for 3D/4D World Generation Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 532
FabricGen: Microstructure-Aware Woven Fabric Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 533
Leveraging Verifier-Based Reinforcement Learning in Image Editing
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 534
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 535
VIVA: VLM-Guided Instruction-Based Video Editing with Reward Optimization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 536
MapReduce LoRA: Advancing the Pareto Front in Multi-Preference Optimization for Generative Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 537
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 538
C^2FG: Control Classifier-Free Guidance via Score Discrepancy Analysis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 539
Learning What to Trust: Bayesian Prior-Guided Optimization for Visual Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 540
Unified Customized Generation by Disentangled Reward Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 541
Region-Aware Instance Consistency Learning for Micro-Expression Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 542
MPL: Match-guided Prototype Learning for Few-shot Action Recognition
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 543
LaDy: Lagrangian-Dynamic Informed Network for Skeleton-based Action Segmentation via Spatial-Temporal Modulation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 544
LA-Pose: Latent Action Pretraining Meets Pose Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 545
RAAS: LLM Agentic System Architecture Search with GRPO
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 546
Temporal Representation Enhancement (TRE): Learning to Forget Dominant Patterns for Enhanced Temporal Spiking Features
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 547
Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 548
Unlocking Pre-trained Weights: Parameter Inheritance for Zero-Shot Initialization
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 549
Deconstructing the Failure of Ideal Noise Correction: A Three-Pillar Diagnosis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 550
Progressive Neural Architecture Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 551
A Unified Framework for Knowledge Transfer in Bidirectional Model Scaling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 552
When Do Models Actually Decide? Mapping the Layer-Wise Decision Timeline in Pretrained Neural Networks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 553
Temporal Interaction in Spiking Transformers with Multi-Delay Mixer
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 554
Consensus vs. Controversy: Mapping the Decision Space Where Architectures Diverge
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 555
Sparsely Timing the Change: A Spiking Temporal Framework for Remote Sensing Interpretation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 556
ProSoftArena: Benchmarking Hierarchical Capabilities of Multi-modal Agents in Professional Software Environments
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 557
BAMI: Training-Free Bias Mitigation in GUI Grounding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 558
DRS-GUI: Dynamic Region Search for Training-Free GUI Grounding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 559
Consistency Beyond Contrast: Enhancing Open-Vocabulary Object Detection Robustness via Contextual Consistency Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 560
Thermal-Det: Language-Guided Cross-Modal Distillation for Open-Vocabulary Thermal Object Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 561
Geometry-driven OOD Detectors Are Class-Incremental Learners
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 562
Mind the Way You Select Negative Texts: Pursuing the Distance Consistency in OOD Detection with VLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 563
Prompt-Free Unknown Label Generation for Open World Detection in Remote Sensing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 564
Learning to Diversify and Focus: A Reinforcement Framework for Open-Vocabulary HOI Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 565
RINO: Rotation-Invariant Non-Rigid Correspondences
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 566
Hyperbolic Prototype Learning with Uncertainty-Aware Consistency for Continual Test-Time Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 567
DINO Eats CLIP: Adapting Beyond Knowns for Open-set 3D Object Retrieval
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 568
Leveraging Class Distributions in CLIP for Weakly Supervised Semantic Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 569
CompetitorFormer: Mitigating Query Conflicts for 3D Instance Segmentation via Competitive Strategy
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 570
D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 571
Discover, Segment, and Select: A Progressive Mechanism for Zero-shot Camouflaged Object Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 572
D-Convexity: A Unified Differentiable Convex Shape Prior via Quasi-Concavity for Data-driven Image Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 573
Fast Reasoning Segmentation for Images and Videos
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 574
Structure-Aware Representation Distillation for Tiny-Dense Object Segmentation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 575
CRFT: Consistent–Recurrent Feature Flow Transformer for Cross-Modal Image Registration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 576
FireScope: Wildfire Risk Raster Prediction With a Chain-of-Thought Oracle
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 577
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 578
TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 579
Regulating Rather than Constraining: Adaptive Guidance for Complex Spectral Reconstruction in Pansharpening
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 580
GeoMMBench and GeoMMAgent: Toward Expert-Level Multimodal Intelligence in Geoscience and Remote Sensing
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 581
Revisiting the Necessity of Full Accuracy: Weakly Supervised Object-Level Offset Correction for Misaligned Building Labels
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 582
UniGeoSeg: Towards Unified Open-World Segmentation for Geospatial Scenes
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 583
ZoomEarth: Active Perception for Ultra-High-Resolution Geospatial Vision-Language Tasks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 584
Unleashing Stealthy Backdoor Pandemic by Infecting a Single Diffusion Model
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 585
Taming the Long Tail: Rebalancing Adversarial Training via Adaptive Perturbation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 586
Robustness Under Data Scarcity: Few-Shot Continual Adversarial Training for Evolving Threats
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 587
Logit-Margin Repulsion for Backdoor Defense
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 588
Thermally Activated Dual-Modal Adversarial Clothing against AI Surveillance Systems
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 589
Immunizing Models Against Harmful Long-Horizon Fine-Tuning via Contractive Optimization Dynamics
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 590
Towards Stealthy and Effective Backdoor Attacks on Lane Detection: A Naturalistic Data Poisoning Approach
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 591
Red-teaming Retrieval-Augmented Diffusion Models via Poisoning Knowledge Bases
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 592
Latent Diffusion Inversion Requires Understanding the Latent Space
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 593
Fractal Camouflage: A Bio-Inspired Approach for Multi-Scale Adversarial Attacks in the Infrared Domain
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 594
EgoRoC: Towards Egocentric Robotic Control via Task-Agnostic Visual Alignment
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 595
Describe Anything Anywhere At Any Moment
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 596
StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 597
VLA Models Are More Generalizable Than You Think: Revisiting Physical and Spatial Modeling
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 598
Action–Geometry Prediction with 3D Geometric Prior for Bimanual Manipulation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 599
Joint-Aligned Latent Action: Towards Scalable VLA Pretraining in the Wild
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 600
Rethinking Camera Choice: An Empirical Study on Fisheye Camera Properties in Robotic Manipulation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 601
INSIGHT Bench: Towards Grounded IN-SItu Guidance for Robotic ManipulaTion
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 602
MM-ACT: Learn from Multimodal Parallel Generation to Act
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 603
HQC-NBV: A Hybrid Quantum-Classical View Planning Approach
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 604
Motus: A Unified Latent Action World Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 605
SE(3)-Equivariance with Geometric and Topological Guidance for Category-Level Object Pose Estimation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 606
SPEAR-1: Scaling Beyond Robot Demonstrations via 3D Understanding
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 607
Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 608
RoboTAG: End-to-end Robot Pose Estimation via Topological Alignment Graph
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 609
MVLM: Template-Free Tracking via Vision–Language Margin Confidence and Memory-Gated Tracking
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 610
Interactive Tracking: A Human-in-the-Loop Paradigm with Memory-Augmented Adaptation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 611
VidEoMT: Your ViT is Secretly Also a Video Segmentation Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 612
Matching Every Pair to Track Every Point: PairFormer for All-Pairs Tracking and Video Trajectory Fields
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 613
Boosting Self-Supervised Tracking with Contextual Prompts and Noise Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 614
Progressive Multi-cue Alignment for Unaligned RGBT Tracking
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 615
Real-Time Neural Video Compression with Unified Intra and Inter Coding
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 616
Adapting Lightweight Image-based Counting Models for Video Crowd Counting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 618
MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 619
MedKCO: Medical Vision-Language Pretraining via Knowledge-Driven Cognitive Orchestration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 620
Toward Generalizable Whole Brain Representations with High-Resolution Light-Sheet Data
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 621
CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 622
GenTract: Generative Global Tractography
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 623
LUMINA: A Multi-Vendor Mammography Benchmark with Energy Harmonization Protocol
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 624
Virtual Immunohistochemistry Staining with Dual-Aligned Multi-Task Feature Guidance
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 625
Can Natural Image Autoencoders Compactly Tokenize fMRI Volumes for Long-Range Dynamics Modeling?
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 626
IEBGL: An Interpretability-Enhanced Brain Graph Learning Framework with LLM-Instructed Topology and Literature-Augmented Semantics
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 627
F^2-Assist: Multi-Phase Fetal Growth Forecast and Report Generation from Ultrasound Examination
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 628
Sparse Spectral LoRA: Routed Experts for Medical VLMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 629
SAT-RRG: LLM-Guided Self-Adaptive Training for Radiology Report Generation with Token-Level Push–Pull Optimization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 630
OralGPT-Plus: Learning to Use Visual Tools via Reinforcement Learning for Panoramic X-ray Analysis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 631
Structural–Semantic Perception for Diffusion-Guided Temporal Forgery Localization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 632
Forensic-Friendly Image Manipulation via Controllable Latent Diffusion
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 633
IncreFA: Breaking the Static Wall of Generative Model Attribution
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 634
AVFakeBench: A Comprehensive Audio-Video Forgery Detection Benchmark for AV-LMMs
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 635
Detecting Compressed AI-Generated Images via Phase Spectrum Robustness
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 636
Detect Any AI-Counterfeited Text Image
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 637
DeepfakeImpact: A Two-Stage Benchmark with Real-World Impact in Deepfake Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 638
Enhancing the Security of Visual Speaker Authentication Based on Dynamic Lip-Print Analysis
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 639
SimLBR: Learning to Detect Fake Images by Learning to Detect Real Images
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 640
Editprint: General Digital Image Forensics via Editing Fingerprint with Self-Augmentation Training
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 641
Detecting AI-Generated Forgeries via Iterative Manifold Deviation Amplification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 642
Goldilocks Test Sets for Face Verification
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 643
Fine-VAD: Towards Fine-Grained Video Anomaly Detection via Progressive Cross-Granularity Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 644
DLVP-CLIP: Enhancing Fine-Grained Zero-Shot Anomaly Detection via Dynamic Local Visual Prompting
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 645
MoECLIP: Patch-Specialized Experts for Zero-shot Anomaly Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 646
Alert-CLIP: Abnormality-aware Latent-Enhanced Representation Tuning of CLIP for Video Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 647
AnomalyVFM -- Transforming Vision Foundation Models into Zero-Shot Anomaly Detectors
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 649
Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 650
GS-CLIP: Zero-shot 3D Anomaly Detection by Geometry-Aware Prompt and Synergistic View Representation Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 651
TLMA: Mitigating the Impact of Weakly Labeled Information for Video Anomaly Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 652
Defect Cue-Preserved Structural Feature Refinement for Few-Shot Anomaly Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 653
Anomaly-Related Residual Fields for Cross-domain Anomaly Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 654
From Attraction to Equilibrium: Physics-Inspired Semantic Gravitons for Zero-Shot Anomaly Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 655
Joint Learning of General and Diverse Patterns with Mixture of Memory Experts for Weakly-Supervised Video Anomaly Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 656
No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 657
FB-CLIP: Fine-Grained Zero-Shot Anomaly Detection with Foreground-Background Disentanglement
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 658
DynamicVGGT: Learning Dynamic Point Maps for 4D Scene Reconstruction in Autonomous Driving
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 659
GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 660
Test-Time 3D Occupancy Prediction
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 661
Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 663
dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 664
RegionRoute: Regional Style Transfer with Diffusion Model
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 665
Low-Rank Residual Diffusion Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 666
RDF-MIG: A Robust Diffusion Framework for Masked Image Generation to Augment Semantic Segmentation and Change Detection
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 667
TC-Padé: Trajectory-Consistent Padé Approximation for Diffusion Acceleration
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 668
Bi-directional Autoregressive Diffusion for Large Complex Motion Interpolation
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 669
Guiding Token-Sparse Diffusion Models
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 670
Accelerating Diffusion-based Video Editing via Heterogeneous Caching: Beyond Full Computing at Sampled Denoising Timestep
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 672
High-Fidelity Virtual Try-On beyond Paired Data Scarcity via Diffusion-based Cycle-Consistent Learning
[
Poster]
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 673
Sampling-Aware Quantization for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 674
CRAFT: Aligning Diffusion Models with Fine-Tuning Is Easier Than You Think
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 675
Scale Space Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 676
Making Training-Free Diffusion Segmentors Scale with the Generative Power
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 677
Roots Beneath the Cut: Uncovering the Risk of Concept Recovery in Pruning-Based Unlearning for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 678
Few-Step Diffusion Sampling Through Instance-Aware Discretizations
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 679
SpeeDiff: Scalable Pixel-Anchored End-to-End Latent Diffusion Model
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 680
Structure-to-Intensity Diffusion for Adverse-Weather LiDAR Generation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 681
Focal–General Diffusion Model with Semantic Consistent Guidance for Sign Language Production
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 682
Diffusion Probe: Generated Image Result Prediction Using CNN Probes
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 683
Content-Aware Dynamic Patchification for Efficient Video Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 684
PixelRush: Ultra-Fast, Training-Free High-Resolution Image Generation via One-step Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 685
Diffusion-Based sRGB Real Noise Generation via Prompt-Driven Noise Representation Learning
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 686
Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 687
GROW: Watermark Generation with Progressive Guidance for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 688
MotionV2V: Editing Motion in a Video
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 689
Mind the Generative Details: Direct Localized Detail Preference Optimization for Video Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 690
OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 691
DreamStyle: A Unified Framework for Video Stylization
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 692
Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 693
Designing Instance-Level Sampling Schedules via REINFORCE with James-Stein Shrinkage
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 694
Reward Sharpness-Aware Fine-Tuning for Diffusion Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 695
DBMSolver: A Training-free Diffusion Bridge Sampler for High-Quality Image-to-Image Translation
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 696
Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 697
TAP: A Token-Adaptive Predictor Framework for Training-Free Diffusion Acceleration
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 698
Cross-modal Representation Learning for Diffusion-generated Image Detection
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 699
Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 700
Back to Basics: Let Denoising Generative Models Denoise
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 701
CaricHarmony: Contrastive Diffusion Paths for Identity-Preserving Caricature Synthesis
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 702
DiP: Taming Diffusion Models in Pixel Space
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 703
RAPID: Reusing Attention Sparsity with Inter-step Adaptation for Efficient Video Diffusion
Poster
Sun Jun 07 10:45 AM -- 12:45 PM (PDT) @ ExHall F 704
Efficient and Training-Free Single-Image Diffusion Models