Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 1
Differentiable Laplacian Matrix Guided Superpixel Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 2
FILTR: Extracting Topological Features from Pretrained 3D Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 3
Learning Convex Decomposition via Feature Fields
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 4
Learning Eigenstructures of Unstructured Data Manifolds
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 6
CineBrain: A Large-Scale Multi-Modal Audiovisual Brain Dataset for Brain-Conditioned Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 7
Hearing the Room Through the Shape of the Drum: Modal-Guided Sound Recovery from Multi-Point Surface Vibrations
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 8
SDTrack: A Baseline for Event-based Tracking via Spiking Neural Networks
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 9
Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 10
Wan-Weaver: Interleaved Multi-modal Generation via Decoupled Training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 11
CURE: Curriculum-guided Multi-task Training for Reliable Anatomy Grounded Report Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 12
DK-DDIL: Adaptive Knowledge Retention for Dynamic Domain-Incremental Learning in Medical Imaging
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 13
Dual-level Adapter Boosting Prompt-free Curvilinear Structure Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 14
LATA: Laplacian-Assisted Transductive Adaptation for Conformal Uncertainty in Medical VLMs
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 15
Medic-AD: Towards Medical Vision-Language Model's Clinical Intelligence
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 16
SegMoTE: Token-Level Mixture of Experts for Medical Image Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 17
Efficient Unrolled Networks for Large-Scale 3D Inverse Problems
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 18
FedAdamom: Adaptive Momentum for Improved Generalization in Federated Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 19
SimScale: Learning to Drive via Real-World Simulation at Scale
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 20
Texvent: Asynchronous Event Data Simulation via Text Prompt
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 21
WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 22
BuildingGPT: Auto-Regressive Building Wireframe Reconstruction Model with Reinforcement Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 23
Emergent Extreme-View Geometry in 3D Foundation Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 24
LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 25
LASER: Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 26
PanoVGGT: Feed-Forward 3D Reconstruction from Panoramic Imagery
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 27
Rascene: High-Fidelity 3D Scene Imaging with mmWave Communication Signals
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 28
VGG-T^3: Offline Feed-Forward 3D Reconstruction at Scale
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 30
OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 32
HeSS: Head Sensitivity Score for Sparsity Redistribution in VGGT
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 33
Dense Metric Depth Completion from Sparse Direct Time-of-Flight Sensors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 34
Online3R: Online Learning for Consistent Sequential Reconstruction Based on Geometry Foundation Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 35
Neu-PiG: Neural Preconditioned Grids for Fast Dynamic Surface Reconstruction on Long Sequences
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 36
Learning 3D Reconstruction with Priors in Test Time
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 37
ArchSym: Detecting 3D-Grounded Architectural Symmetries in the Wild
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 38
PointTPA: Dynamic Network Parameter Adaptation for 3D Scene Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 39
tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 40
Hint2Gen: Bridging Understanding and Generation via Code-structured Hints
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 41
Compositional Text-to-Image Generation Via Region-aware Bimodal Direct Preference Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 42
Learning by Analogy: A Causal Framework for Compositional Generalization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 43
ID-Crafter: VLM-Grounded Online RL for Compositional Multi-Subject Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 44
GenColorBench: A Color Evaluation Benchmark for Text-to-Image Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 45
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 46
When Pretty Isn’t Useful: Investigating Why Modern Text-to-Image Models Fail as Reliable Training Data Generators
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 47
TempoControl: Temporal Attention Guidance for Text-to-Video Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 48
Hear What Matters! Text-conditioned Selective Video-to-Audio Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 49
MultiCrafter: High-Fidelity Multi-Subject Generation via Disentangled Attention and Identity-Aware Preference Alignment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 51
DiffGraph: An Automated Agent-driven Model Merging Framework for In-the-Wild Text-to-Image Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 52
Gloria: Consistent Character Video Generation via Content Anchors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 53
DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 54
M4V: Multimodal Mamba for Efficient Text-to-Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 55
Property-Informed Diffusion-Based Text-to-Microstructure Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 56
DreamingComics: A Story Visualization Pipeline via Subject and Layout Customized Generation using Video Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 57
Mixture of States: Routing Token-Level Dynamics for Multimodal Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 58
HiCoGen: Hierarchical Compositional Text-to-Image Generation in Diffusion Models via Reinforcement Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 59
TherA: Thermal-Aware Visual-Language Prompting for Controllable RGB-to-Thermal Infrared Translation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 60
See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 61
CoV-Align: Efficient Fine-grained Cross-Modal Alignment with Cohesive Visual Semantics Priority
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 62
TDATR: Improving End-to-End Table Recognition via Table Detail-Aware Learning and Cell-Level Visual Alignment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 63
A Mixed Diet Makes DINO An Omnivorous Vision Encoder
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 64
Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 65
TaskForce: Cooperative Multi-agent Reinforcement Learning for Multi-task Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 66
PhyCritic: Multimodal Critic Models for Physical AI
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 67
R-C2: Cycle-Consistent Reinforcement Learning Improves Multimodal Reasoning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 68
Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 69
Unified Generation and Self-Verification for Vision-Language Models via Advantage Decoupled Preference Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 70
Anchoring the Mind of Multimodal Reasoners: Cognitive Bias as a Vector for Jailbreak Attacks
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 71
InsCal: Calibrated Multi-Source Fully Test-Time Prompt Tuning for Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 72
Why Not Hyperparameter-Friendly Optimisation? A Monotonic Adaptive Norm Rescaling Approach For Long-Tailed Recognition
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 73
Decoupling Vision and Language: Codebook Anchored Visual Adaptation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 74
MemFlow: A Lightweight Forward Memorizing Framework for Quick Domain Adaptive Feature Mapping
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 75
Mind the Discriminability Trap in Source-Free Cross-domain Few-shot Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 76
Vision-Language Model Guided Source-Free Domain Adaptation via Optimal Transport
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 77
Masked Representation Modeling for Domain-Adaptive Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 78
TaskIT: Memory-Efficient Fine-Tuning of Multi-LoRA LLMs via Cross-Task Importance Transfer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 79
ARES: Unifying Asymmetric RGB-Event Stereo for Probabilistic Scene Flow Estimation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 80
MER-Tracker: Towards High-Speed 3D Point Tracking via Multi-View Event-RGB Hybrid Cameras
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 81
Moving Border Ownership for Event-based Motion Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 82
TTAPFormer: Robust Arbitrary Point Tracking via Transient Asynchronous Fusion of Frames and Events
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 83
EventHub: Data Factory for Generalizable Event-Based Stereo Networks without Active Sensors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 84
Seeing Motion Through Polarity for Event-based Action Recognition
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 85
Multi-Scale Gaussian-Language Map for Zero-shot Embodied Navigation and Reasoning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 86
Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 87
SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 88
TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 89
AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 90
Experience Transfer for Multimodal LLM Agents in Minecraft Game
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 91
MSGNav: Unleashing the Power of Multi-modal 3D Scene Graph for Zero-Shot Embodied Navigation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 92
SaPaVe: Towards Active Perception and Manipulation in Vision-Language Action Models for Robotics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 93
MANSION: Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 94
RealAppliance: Let High-fidelity Appliance Assets Controllable and Workable as Aligned Real Manuals
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 95
ForeAct: Steering Your VLA with Efficient Visual Foresight Planning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 96
Affordance Field Intervention: Enabling VLAs to Escape Memory Traps in Robotic Manipulation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 97
MERIT: Multi-domain Efficient RAW Image Translation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 98
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 99
Probabilistic Prompt Adaptation for Unified Image Aesthetics and Quality Assessment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 100
EMMA: Concept Erasure Benchmark with Comprehensive Semantic Metrics and Diverse Categories
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 101
Too Vivid to Be Real? Benchmarking and Calibrating Generative Color Fidelity
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 102
WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 103
UnicEdit-10M: A Dataset and Benchmark Breaking the Scale-Quality Barrier via Unified Verification for Reasoning-Enriched Edits
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 104
Inter-Edit: First Benchmark for Interactive Instruction-Based Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 105
PR-IQA: Partial-Reference Image Quality Assessment for Diffusion-Based Novel View Synthesis
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 106
LumiMotion: Improving Gaussian Relighting with Scene Dynamics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 107
Let it Snow! Animating 3D Gaussian Scenes with Dynamic Weather Effects via Physics-Guided Score Distillation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 108
iLRM: An Iterative Large 3D Reconstruction Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 109
MVInverse: Feed-forward Multiview Inverse Rendering in Seconds
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 110
From None to All: Self-Supervised 3D Reconstruction via Novel View Synthesis
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 111
MoRel: Long-Range Flicker-Free 4D Motion Modeling via Anchor Relay-based Bidirectional Blending with Hierarchical Densification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 112
Multi-view Pyramid Transformer: Look Coarser to See Broader
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 113
CaT-GS: Efficient 3DGS Rendering for Large Scale Scenes via Inter-frame Caching and Tile Scheduling
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 114
RL‑ScanIQA: Reinforcement-Learned Scanpaths for Blind 360° Image Quality Assessment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 115
Benchmarking Endoscopic Surgical Image Restoration and Beyond
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 116
SDUIE: Semi-Supervised Diffusion for Underwater Image Enhancement with Quant-Text Dual Control
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 117
HiDRA: Hierarchical Degradation Representation and Adaptation with Generative Priors for Enhancing Infrared Vision
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 118
BluRef: Unsupervised Image Deblurring with Dense-Matching References
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 119
Bi-Bridge: Bidirectional Diffusion Bridges for Low-Light Image Enhancement
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 120
UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 121
MatAnyone 2: Scaling Video Matting via a Learned Quality Evaluator
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 122
SelfHVD: Self-Supervised Handheld Video Deblurring
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 123
Spatio-Temporal Difference Guided Motion Deblurring with the Complementary Vision Sensor
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 124
Learning Where to Look and How to Judge: Resolution-agnostic Image Quality Assessment with Quality-aware Saliency
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 125
Bridging RGB and Hematoxylin Components: An Interleaved Guidance and Fusion Framework for Point Supervised Nuclei Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 126
Virtual Nodes Guided Dynamic Graph Neural Network for Brain Tumor Segmentation with Missing Modalities
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 127
VoxTell: Free-Text Promptable Universal 3D Medical Image Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 128
Photo-Guided Tooth Segmentation on 3D Oral Scan Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 129
Breaking the Continuum: Discrete Distribution Learning for Structural MRI Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 130
Uni-Hema: Unified Model for Digital Hematopathology
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 131
Post-training Feature Pruning for Fundus Images Classification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 132
Sketch2CT: Multimodal Diffusion for Structure-Aware 3D Medical Volume Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 133
SafeLogo: Turning Your Logos into Jailbreak Shields via Micro-Regional Adversarial Training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 134
Anti-I2V: Safeguarding your Photos from Malicious Image-to-video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 135
UniGame: Turning a Unified Multimodal Model Into Its Own Adversary
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 136
Hierarchically Robust Zero-shot Vision-language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 137
Beyond Text Prompts: Precise Concept Erasure through Text–Image Collaboration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 138
AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 139
ReMoE: Region-Mixture Experts for Adversarially-Robust Vision Transformers
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 140
TreeTeaming: Autonomous Red-Teaming of Vision-Language Models via Hierarchical Strategy Exploration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 141
SO-Bench: A Structural Output Evaluation of Multimodal LLM
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 142
Chain-of-Thought Guided Multi-Modal Object Re-Identification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 143
When Lines Meet Textures: Spatial-Frequency Aligned Diffusion Features for Cross-Sparsity Correspondence
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 144
CountGD++: Generalized Prompting for Open-World Counting
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 145
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 146
Parameter-Efficient Adaptation for MLLMs via Implicit Modality Decomposition
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 147
Hyperbolic Gramian Volumes for Multimodal Alignment
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 148
Venus: Benchmarking and Empowering Multimodal Large Language Models for Aesthetic Guidance and Cropping
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 149
AutoCut: End-to-end advertisement video editing based on multimodal discretization and controllable generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 150
StableMTL: Repurposing Latent Diffusion Models for Multi-Task Learning from Partially Annotated Synthetic Datasets
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 151
CaReFlow: Cyclic Adaptive Rectified Flow for Multimodal Fusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 152
Lenses: Toward Polysemous Vision–Language Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 153
CoRiM: Conflict-driven Risk Minimization for Dynamic Multimodal Fusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 154
Uncertainty-Aware Exploratory Direct Preference Optimization for Multimodal Large Language Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 155
CICA: Coupling Confidence-Aware Pretraining with Confidence-Informed Attention for Robust Multimodal Sentiment Analysis
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 156
SAMTok: Representing Any Mask with Two Words
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 157
Multi-Metric Representation Learning Strategy Based on Clustering for Fine-Grained Multimodal Sentiment Analysis
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 159
MMSD3.0: A Multi-Image Benchmark for Real-World Multimodal Sarcasm Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 160
Anchor-Guided Gradient Alignment for Incomplete Multimodal Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 161
PyraTok: Language-Aligned Pyramidal Tokenizer for Video Understanding and Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 162
VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 163
Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 164
VideoCoF: Unified Video Editing with Temporal Reasoner
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 165
Progressive Supernet Training for Efficient Visual Autoregressive Modeling
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 166
CoT-Edit: Let CoT Guide Instruction Video Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 168
Test-Time Instance-Specific Parameter Composition: A New Paradigm for Adaptive Generative Modeling
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 169
Understanding, Accelerating, and Improving MeanFlow Training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 170
Meta-CoT: Enhancing Granularity and Generalization in Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 171
Dual-Granularity Memory for Efficient Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 172
Unified Camera Positional Encoding for Controlled Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 173
EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 175
PLACID: Identity-Preserving Multi-Object Compositing via Video Diffusion with Synthetic Trajectories
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 176
Object-WIPER: Training-Free Object and Associated Effect Removal in Videos
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 177
Mobile-VTON: High-Fidelity On-Device Virtual Try-On
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 178
Progress by Pieces: Test-Time Scaling for Autoregressive Image Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 179
Towards Robust Sequential Decomposition for Complex Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 180
Layer Consistency Matters: Elegant Latent Transition Discrepancy for Generalizable Synthetic Image Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 181
Chain of Event-Centric Causal Thought for Physically Plausible Video Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 182
LoL: Longer than Longer, Scaling Video Generation to Hour
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 183
FlowMotion: Training-Free Flow Guidance for Video Motion Transfer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 184
Learning Straight Flows: Variational Flow Matching for Efficient Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 185
SIGMA: Selective-Interleaved Generation with Multi-Attribute Tokens
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 186
DNF-SR: Dual-Input and Negative-Aware Feature Fine-Tuning for Real-World Image Super-Resolution
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 187
IFCSR: Inference-Free Fidelity-Realism Control for One-Step Diffusion-based Real-World Image Super-Resolution
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 188
Edge-Focused Super-Resolution for Omnidirectional Images with Spherical Geometric Augmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 189
TUDSR: Twice Upsampling-Diffusion for Higher Super-Resolution
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 190
PS-SR: Pseudo-Single-Step Video Super-Resolution via Speculative Diffusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 191
Disentangled Textual Priors for Diffusion-based Image Super-Resolution
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 192
Remote Sensing Image Super-Resolution for Imbalanced Textures: A Texture-Aware Diffusion Framework
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 193
Rethinking Diffusion Model-Based Video Super-Resolution: Leveraging Dense Guidance from Aligned Features
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 194
DreamSR: Towards Ultra-High-Resolution Image Super-Resolution via a Receptive-Field Enhanced Diffusion Transformer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 195
FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 196
STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 197
Towards Highly-Constrained Human Motion Generation with Retrieval-Guided Diffusion Noise Optimization
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 198
Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 199
Human Geometry Distribution for 3D Animation Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 200
A Temporal and Content Co-Awareness Latent Diffusion for Controllable Hand Image Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 201
Superman: Unifying Skeleton and Vision for Human Motion Perception and Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 202
Learning to Assist: Physics-Grounded Human-Human Control via Multi-Agent Reinforcement Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 203
Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 204
Causal Motion Diffusion Models for Autoregressive Motion Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 205
Towards Storytelling Animations: Joint Synthesis of Human and Camera Motions
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 206
MoLingo: Motion–Language Alignment for Text-to-Human Motion Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 207
End-to-End Language-Action Model for Humanoid Whole Body Control
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 208
Toward Early Quality Assessment of Text-to-Image Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 209
CoD: A Diffusion Foundation Model for Image Compression
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 210
Diffusion MRI Transformer with a Diffusion Space Rotary Positional Embedding (D-RoPE)
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 211
Language-Guided One-Step Diffusion Model for Nighttime Flare Removal
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 212
SpiralDiff: Spiral Diffusion with LoRA for RGB-to-RAW Conversion Across Cameras
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 213
PnP-CM: Consistency Models as Plug-and-Play Priors for Inverse Problems
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 214
Landscape-Awareness for Geometric View Diffusion Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 215
Otil: Accelerating Diffusion Model Inference via Communication-Efficient Multi-GPU Parallelism
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 216
REACH: Explicit Recovery Behavior for Diffusion Policies
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 217
OralGPT-Omni: A Versatile Dental Multimodal Large Language Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 218
CrossHOI-Bench: A Unified Benchmark for HOI Evaluation across Vision-Language Models and HOI-Specific Methods
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 219
The LLM Bottleneck: Why Open-Source Vision LLMs Struggle with Hierarchical Visual Recognition
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 220
Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 222
Beyond Single Images: A Comprehensive Benchmark for Album-Level Vision-Language Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 223
LIBERO-Plus: A Progressive Robustness Benchmark for Visual-Language-Action Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 224
Scenes as Tokens: Multi-Scale Normal Distributions Transform Tokenizer for General 3D Vision–Language Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 226
Hear you are: Teaching LLMs Spatial Reasoning with Vision and Spatial Sound
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 227
EgoMind: Activating Spatial Cognition through Linguistic Reasoning in MLLMs
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 228
SAQN: Semantic-based Adaptive Query Network for 3D Referring Expression Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 229
EagleVision: A Dual-Stage Framework with BEV-grounding-based Chain-of-Thought for Spatial Intelligence
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 230
Abstract 3D Perception for Spatial Intelligence in Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 231
PV-Ground: Text-Guided Point-Voxel Interaction for 3D Visual Grounding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 232
Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 233
SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 235
PARSE: Part-Aware Relational Spatial Modeling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 237
MCHDoc: A Comprehensive Benchmark for Reading Multi-Carrier Chinese Historical Documents
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 238
Cross-modal Fuzzy Alignment Network for Text-Aerial Person Retrieval and A Large-scale Benchmark
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 239
CodeMMR: Bridging Natural Language, Code, and Image for Unified Retrieval
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 240
DiT-Distill: Open-Set Fine-Grained Retrieval via Generative Curriculum Knowledge
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 241
ReCALL: Recalibrating Capability Degradation for MLLM-based Composed Image Retrieval
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 242
Love Me, Love My Label: Rethinking the Role of Labels in Prompt Retrieval for Visual In-Context Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 243
Rethinking BCE Loss for Multi-Label Image Recognition with Fine-Tuning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 244
CAST: Context-Aware Dynamic Latent Space Transformation for Interactive Text-to-Image Retrieval
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 245
PriVi: Towards a General-Purpose Video Model for Primate Behavior in the Wild
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 246
Seeing Conversations: Communication Context Identification in Egocentric Video
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 247
Interactive Episodic Memory with User Feedback
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 248
Seeing without Pixels: Perception from Camera Trajectories
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 249
PFGNet: A Fully Convolutional Frequency-Guided Peripheral Gating Network for Efficient Spatiotemporal Predictive Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 250
Minerva-Ego: Spatiotemporal Hints for Egocentric Video Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 251
StreamRAG: Enhancing Real-Time Video Understanding with Retrieval Augmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 252
ViKey: Enhancing Temporal Understanding in Videos via Visual Prompting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 253
SkillSight: Efficient First-Person Skill Assessment with Gaze
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 254
BriMA: Bridged Modality Adaptation for Multi-Modal Continual Action Quality Assessment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 255
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 256
MedLIME: A Distribution-Aligned and Evidence-Supported Framework for Medical Saliency Explanations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 257
Inside-Out: Measuring Generalization in Vision Transformers Through Inner Workings
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 258
Language Models Can Explain Visual Features via Steering
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 259
Making the Classification Explanation Faithful to the Confidence Score
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 260
Intrinsic Concept Extraction Based on Compositional Interpretability
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 261
Attribution-Guided Model Rectification of Unreliable Neural Network Behaviors
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 262
Measuring the (Un)Faithfulness of Concept-Based Explanations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 263
Deformation-based In-Context Learning for Point Cloud Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 265
ESAM++: Efficient Online 3D Perception on the Edge
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 266
DualReg: Dual-Space Filtering and Reinforcement for Rigid Registration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 268
Rethinking 2D-3D Registration: A Novel Network for High-Value Zone Selection and Representation Consistency Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 269
Adaptive 3D Perception for Small Aerial Targets Under Sparse Sampling via Reinforcement Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 270
3D sans 3D Scans: Scalable Pre-training from Video-Generated Point Clouds
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 271
StreamVLO: Streaming Visual–LiDAR Odometry with Cumulative Drift Compensation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 272
Mamba Learns in Context: Structure-Aware Domain Generalization for Multi-Task Point Cloud Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 273
Routing on Demand: DSNet for Efficient Progressive Point Cloud Denoising
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 274
Hyper-PCN: Hypergraph-Based Point Cloud Completion via High-Order Correlation Modeling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 275
Towards Calibrating Prompt Tuning of Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 276
DEVA: Fine-tuning Multimodal Large Language Models for Visual Perception Tasks
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 277
LOREAL: Mitigating Low-Resolution Challenges in Vision-Language Models with Attribute-driven Prompt Self-Distillation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 278
OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 279
Language-guided Frequency Modulation for Large Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 280
TANGO: Text-Anchored Guided Optimization for Robust Fine-tuning Vision-Language Models under Label Noise
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 281
Cluster-Wise Spatio-Temporal Masking for Efficient Video-Language Pretraining
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 282
Reconstructing CLIP for Open-Vocabulary Dense Perception
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 283
DPL: Decoupled Prototype Learning for Enhancing Robustness of Vision–Language Transformers to Missing Modalities
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 284
BrepVGAE: Variational Graph Autoencoder with Unified Latent Representation for B-rep
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 285
NeuROK: Generative 4D Neural Object Kinematics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 286
BrickNet: Graph-Backed Generative Brick Assembly
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 287
Unified Vector Floorplan Generation via Markup Representation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 288
CME-CAD: Heterogeneous Collaborative Multi-Expert Reinforcement Learning for CAD Code Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 289
Robo-SGG: Exploiting Layout-Oriented Normalization and Restitution Can Improve Robust Scene Graph Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 290
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 291
EpiAgent: An Agent-Centric System for Ancient Inscription Restoration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 294
Image-based Outlier Synthesis With Training Data
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 295
SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 297
When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 298
IrisFP: Adversarial-Example-based Model Fingerprinting with Enhanced Uniqueness and Robustness
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 299
Mark4D: Temporally-Consistent Watermarking for 4D Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 300
Machine Unlearning via Adaptive Gradient Reweighting and Multi-stage Objective Optimization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 301
Taming Noise-Induced Prototype Degradation for Privacy-Preserving Personalized Federated Fine-Tuning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 302
FedMOP: Achieving Enhanced Privacy and Performance in Federated Learning via Momentum Orthogonal Projection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 303
HFedATM: Hierarchical Federated Domain Generalization via Optimal Transport and Regularized Mean Aggregation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 304
Single-Round Scalable Analytic Federated Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 305
Controllable Federated Prompt Learning at Test Time
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 307
Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 308
Spatial Matters: Position-Guided 3D Referring Expression Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 309
Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 310
Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 311
CaptionFormer: Unified Segmentation, Tracking, and Captioning for Spatio-Temporal Objects
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 312
TransPrune: Token Transition Pruning for Efficient Large Vision-Language Model
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 313
QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 314
Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 315
Collaborative Multi-Mode Pruning for Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 316
ZOO-Prune: Training-Free Token Pruning via Zeroth-Order Gradient Estimation in Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 317
HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 318
CORE: Compact Object-centric REpresentations as a New Paradigm for Token Merging in LVLMs
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 319
Imbalanced View Contribution Evaluation and Refinement for Deep Incomplete Multi-View Clustering
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 320
Multi-Hierarchical Contrastive Spectral Fusion for Multi-View Clustering
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 321
SECOS: Semantic Capture for Rigorous Classification in Open-World Semi-Supervised Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 322
Multi-Modal Representation Learning via Semi-Supervised Rate Reduction for Generalized Category Discovery
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 323
TimeBridge: Self-Supervised Video Representation Learning via Start-End Joint Embedding and In-Between Frame Prediction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 324
Mitigating Instance Entanglement in Instance-Dependent Partial Label Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 325
Residual Connections Harm Generative Representation Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 326
Neural Mixture Density Processes
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 327
Large-scale Robust Enhanced Ensemble Clustering via Outlier Decoupling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 328
DriveLaW: Unifying Planning and Video Generation in a Latent Driving World
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 329
DLWM: Dual Latent World Models enable Holistic Gaussian-centric Pre-training in Autonomous Driving
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 330
Latent Chain-of-Thought World Modeling for End-to-End Driving
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 332
TrafficAlign: Aligning Large Language Models for Traffic Scenario Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 333
Failure Modes for Deep Learning–Based Online Mapping: How to Measure and Address Them
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 334
Linking Modality Isolation in Heterogeneous Collaborative Perception
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 335
LEAD: Minimizing Learner-Expert Asymmetry in End-to-End Driving
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 336
DriverGaze360: OmniDirectional Driver Attention with Object-Level Guidance
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 337
Diffusion Forcing Planner: History-Annealed Planning with Time-Dependent Guidance for Autonomous Driving
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 338
DIMOS: Disentangling Instance-level Moving Object Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 339
EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation without Scene Supervision
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 340
Live Interactive Training for Video Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 341
Robust Promptable Video Object Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 342
Scene-VLM: Multimodal Video Scene Segmentation via Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 343
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 344
BEV-CAR: Enhancing Monocular Bird’s Eye View Segmentation with Context-Aware Rasterization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 345
Exploring the Underwater World Segmentation without Extra Training
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 347
Cross-Architecture Adaptation: Cloud-Edge Continual Test-Time Adaptation with Dynamic Sampling and Heterogeneous Distillation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 348
Towards Dynamic Modality Alignment in Multimodal Continual Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 350
Incremental Object Detection via Future-Aware Decoupled Cross-Head Distillation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 351
Smart Replay: Adaptive Scheduling of Memory Rehearsal for Computational Resource-Aware Incremental Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 352
ReBaPL: Repulsive Bayesian Prompt Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 353
Spectral Mixture-of-Experts for Continual Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 354
ActAvatar: Temporally-Aware Precise Action Control for Talking Avatars
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 355
ViBES: A Conversational Agent with Behaviorally-Intelligent 3D Virtual Body
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 356
DeX-Portrait: Disentangled and Expressive Portrait Animation via Explicit and Latent Motion Representations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 357
SketchFaceGS: Real-Time Sketch-Driven Face Editing and Generation with Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 358
MIBURI: Towards Expressive Interactive Gesture Synthesis
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 359
Personalized Image Descriptions from Attention Sequences
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 360
GA-VLN: Geometry-Aware BEV Representation for Efficient Vision-Language Navigation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 361
IMAIA: Interactive Maps AI Assistant for Travel Planning and Geo-Spatial Intelligence
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 362
OctoNav: Towards Generalist Embodied Navigation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 363
WalkGPT: Grounded Vision–Language Conversation with Depth-Aware Segmentation for Pedestrian Navigation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 364
SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 365
SMAP: Semantic Route Planning with Map-Grounded Multimodal Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 367
Fresco: Frequency–Spatial Consistent Optimization for Fine-Grained Head Avatar Modeling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 368
Motion-Aware Animatable Gaussian Avatars Deblurring
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 369
ELITE: Efficient Gaussian Head Avatar from a Monocular Video via Learned Initialization and Test-time Generative Adaptation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 370
Multi-view Consistent 3D Gaussian Head Avatars 'without' Multi-view Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 371
MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 372
Cross-Modal Attention Calibration for LVLM Hallucination Mitigation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 373
3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 374
Exposing and Evaluating Hallucinations for GUI Grounding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 375
Understanding and Mitigating Hallucinations in Multimodal Chain-of-Thought Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 376
Beyond the Global Scores: Fine-Grained Token Grounding as a Robust Detector of LVLM Hallucinations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 377
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 378
Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 379
AniMimic: Imitating 3D Animation from Video Priors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 380
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 381
ScenDi: 3D-to-2D Scene Diffusion Cascades for Urban Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 382
MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 383
GeodesicNVS: Probability Density Geodesic Flow Matching for Novel View Synthesis
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 384
WorldStereo: Bridging Controllable Video Generation and Scene Reconstruction via 3D Geometric Memories
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 385
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 386
Taming Video Models for 3D and 4D Generation via Zero-Shot Camera Control
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 387
Improving Motion in Image-to-Video Models via Adaptive Low-Pass Guidance
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 388
SANER: Switchable Adapter with Non-parametric Enhanced Routing for Person De-Reidentification
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 389
BIT: Matching-based Bi-directional Interaction Transformation Network for Visible-Infrared Person Re-Identification
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 390
Vision-Language Attribute Disentanglement and Reinforcement for Lifelong Person Re-Identification
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 391
Diversity over Uniformity: Rethinking Representation in Generated Image Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 392
Mining Instance-Centric Vision–Language Contexts for Human–Object Interaction Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 393
FSLoRA: Harmonizing Detection and Re-Identification via Freq-Spatial Low-Rank Adapter for One-Stage Person Search
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 394
EEGiT: Teaching Vision Transformers to Understand the EEG signal
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 395
FedBPrompt: Federated Domain Generalization Person Re-Identification via Body Distribution Aware Visual Prompts
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 396
Pose-guided Enriched Feature Learning for Federated-by-camera Person Re-identification
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 397
UAV-CB: A Complex-Background RGB–T Dataset and Local Frequency Bridge Network for UAV Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 398
TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 400
LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 401
Agentic Video Summarization via Self-Reflecting Multimodal Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 402
Self-Critical Distillation Network for Video-based Commonsense Captioning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 403
Ego-Grounding for Personalized Question-Answering in Egocentric Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 404
AdaSpark: Adaptive Sparsity for Efficient Long-Video Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 405
EarlyTom: Early Token Compression Completes Fast Video Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 406
VideoWorld 2: Learning Transferable Knowledge from Real-world Videos
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 407
VirtueBench: Evaluating Trustworthiness under Uncertainty in Long Video Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 408
DiverseDiT: Towards Diverse Representation Learning in Diffusion Transformers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 409
RenderFlow: Single-Step Neural Rendering via Flow Matching
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 410
ResDiT: Evoking the Intrinsic Resolution Scalability in Diffusion Transformers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 411
Masked Region Transformer for Layered Image Generation and Editing at Scale
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 412
DDT: Decoupled Diffusion Transformer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 413
Just-in-Time: Training-Free Spatial Acceleration for Diffusion Transformers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 414
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 415
ShapeAR: Generating Editable Shape Layers via Autoregressive Diffusion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 417
RecTok: Reconstruction Distillation along Rectified Flow
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 418
EgoXtreme: A Dataset for Robust Object Pose Estimation in Egocentric Views under Extreme Conditions
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 419
CoIn3D: Revisiting Configuration-Invariant Multi-Camera 3D Object Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 420
H^2A^2: Homogeneity-Aware and Heterogeneity-Aware Feature Perception for Unified Indoor 3D Object Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 421
Cov2Pose: Leveraging Spatial Covariance for Direct Manifold-aware 6-DoF Object Pose Estimation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 422
Towards Intrinsic-Aware Monocular 3D Object Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 423
SToRe3D: Sparse Token Relevance in ViTs for Efficient Multi-View 3D Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 424
SPAN: Spatial-Projection Alignment for Monocular 3D Object Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 425
DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 426
FailureAtlas: Mapping the Failure Landscape of T2I Models via Active Exploration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 427
HDR-VLM: HDR-Domain Adaptation of VLMs and Preference-Aligned Quality Assessment for HDR Video Color Grading
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 428
RobustVisRAG: Causality-Aware Vision-Based Retrieval-Augmented Generation under Visual Degradations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 429
BiomedCCPL: Causal Conditional Prompt Learning for Biomedical Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 430
DynamicGTR: Leveraging Graph Topology Representation Preferences to Boost VLM Capabilities on Graph QAs
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 431
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 432
Revisiting Visual Corruptions in LVLMs: A Shape–Texture Perspective on Model Failures
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 433
From Intuition to Investigation: A Tool-Augmented Reasoning MLLM Framework for Generalizable Face Anti-Spoofing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 434
Trust-calibrated Collaborative Learning for Long-Tailed Visual Recognition
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 435
SunFaded: Illumination-Aware Gaussian Splatting for Dark Scenes with Camera-Mounted Active Lighting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 436
TokenSplat: Token-aligned 3D Gaussian Splatting for Feed-forward Pose-free Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 437
GOR-IS: 3D Gaussian Object Removal In the Intrinsic Space
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 438
AeroGS: Scale-Aware Gaussian Splatting for Pose-Free Dynamic UAV Scene Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 439
Intrinsic Geometry-Appearance Consistency Optimization for Sparse-View Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 440
AERGS-SLAM: Auto-Exposure-Robust Stereo 3D Gaussian Splatting SLAM
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 441
Learning Differentiable Hierarchies in 3D Gaussian Splatting
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 442
WeatherCity: Urban Scene Reconstruction with Controllable Multi-Weather Transformation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 443
Cross-View Splatter: Feed-Forward View Synthesis with Georeferenced Images
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 444
TagSplat: Topology-Aware Gaussian Splatting for Dynamic Mesh Modeling and Tracking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 445
Hierarchical Visual Relocalization with Nearest View Synthesis from Feature Gaussian Splatting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 446
Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 447
3D Gaussian Splatting from Unposed Spike Stream
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 448
SparseOIT: Improving Order-Independent Transparency 3DGS via Active Set Method
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 449
ClipGStream: Clip-Stream Gaussian Splatting for Any Length and Any Motion Multi-View Dynamic Scene Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 450
Space-Time Forecasting of Dynamic Scenes with Motion-aware Gaussian Grouping
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 451
MoRGS: Efficient Per-Gaussian Motion Reasoning for Streamable Dynamic 3D Scenes
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 452
BEA-GS: BEyond RAdiance Supervision in 3DGS for Precise Object Extraction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 453
EDGS: Eliminating Densification for Efficient Convergence of 3DGS
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 454
ReasonMap: Towards Fine-Grained Visual Reasoning from Transit Maps
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 455
Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 456
DialogueVPR: Towards Conversational Visual Place Recognition
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 457
Perceptual-Evidence Anchored Reinforced Learning for Multimodal Reasoning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 458
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 459
VinQA: Visual Elements Interleaved Long-form Answer Generation for Real-World Multimodal Document QA
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 460
DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 462
VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 463
Grounding Everything in Tokens for Multimodal Large Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 464
Evolving Contextual Safety in Multi-Modal Large Language Models via Inference-Time Self-Reflective Memory
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 465
ChartR: Evaluating Reasoning Accuracy and Robustness in Chart Question Answering
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 466
Think Visually, Reason Textually: Vision-Language Synergy in Abstract Reasoning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 467
VKG-QA: Visual Knowledge Graph-based Question Answer for Large Multimodal Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 468
Med-CMR: A Fine-Grained Benchmark Integrating Visual Evidence and Clinical Logic for Medical Complex Multimodal Reasoning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 470
VITAL: Vision-Encoder-centered Pre-training for LMMs in Visual Quality Assessment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 471
Generative Video Compression with One-Dimensional Latent Representation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 472
Markovian Scale Prediction: A New Era of Visual Autoregressive Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 473
Learned Image Compression via Sparse Attention and Adaptive Frequency
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 474
UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 475
VecAttention: Vector-wise Sparse Attention for Accelerating Long Context Inference
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 476
Ultra-Fast Neural Video Compression
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 477
Parallax to Align Them All: An OmniParallax Attention Mechanism for Distributed Multi-View Image Compression
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 478
Scaling Parallel Sequence Models to Vision Foundation Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 479
Revisiting Model Stitching In the Foundation Model Era
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 480
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 481
VLM-Loc: Localization in Point Cloud Maps via Vision-Language Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 482
HOLO: Homography-Guided Pose Estimator Network for Fine-Grained Visual Localization on SD Maps
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 483
TriLite: Efficient Weakly Supervised Object Localization with Universal Visual Features and Tri-Region Disentanglement
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 484
GeoSURGE: Geo-localization using Semantic Fusion with Hierarchy of Geographic Embeddings
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 486
OVOD-Agent: A Markov–Bandit Framework for Proactive Visual Reasoning and Self-Evolving Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 487
Pixel2Phys: Distilling Governing Laws from Visual Dynamics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 489
Seeing as Experts Do: A Knowledge-Augmented Agent for Open-Set Fine-Grained Visual Understanding
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 490
Dynamic Important Example Mining for Reinforcement Finetuning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 491
Specificity-aware reinforcement learning for fine-grained open-world classification
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 492
SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 493
Uncertainty-Aware Modality Fusion for Unaligned RGB-T Salient Object Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 494
Fusion in Your Way: Aligning Image Fusion with Heterogeneous Demands via Direct Preference Optimization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 495
More Than Meets the Eye: A Unified Image Fusion Framework via Semantic-Pixel Entropy Trade-off for Zero-Shot Generalization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 496
Beyond Sequential Tools: A Unified VLM Agent System for Photographic Post-Processing via Dynamic Multi-Expert Fusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 497
Multi-modal Frequency Decomposition Network for Semantic Scene Completion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 498
BiEvLight: Bi-level Learning of Task-Aware Event Refinement for Low-Light Image Enhancement
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 499
FusionRegister: Every Infrared and Visible Image Fusion Deserves Registration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 500
OmniFood8K: Single-Image Nutrition Estimation via Hierarchical Frequency-Aligned Fusion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 501
Enhancing Unregistered Hyperspectral Image Super-Resolution via Unmixing-based Abundance Fusion Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 502
LRHDR: Learning Representation-enhanced HDR Video Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 503
Cross-Domain Few-Shot Segmentation via Multi-view Progressive Adaptation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 504
Interpretable Cross-Domain Few-Shot Learning with Rectified Target-Domain Local Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 505
PP-Brep: Few-Shot B-rep Classification with Hybrid Graph Representation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 506
AgentDet: A Shared-Blackboard Multi-Agent Framework for Zero-/Few-Shot Object Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 507
SFR-Net: Steering-Fusion-Refining Network in Multi-label Zero-Shot Sewer Defect Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 508
Noise-Aware Few-Shot Learning through Bi-directional Multi-View Prompt Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 510
Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 511
Progressive Mask Distillation for Self-supervised Video Representation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 512
HierAmp: Coarse-to-Fine Autoregressive Amplification for Generative Dataset Distillation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 513
SpiderCam: Low-Power Snapshot Depth from Differential Defocus
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 514
Computational Speckle Pattern Interferometry
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 515
DetectSCI: Toward Object-Guided ROI Reconstruction for High-Resolution Video Snapshot Compressive Imaging
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 516
Solving a Nonlinear Blind Inverse Problem for Tagged MRI with Physics and Deep Generative Priors
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 517
Nonlinear Color Transfer via Learnable Bezier Flows
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 518
VT-Intrinsic: Physics-Based Decomposition of Reflectance and Shading using a Single Visible-Thermal Image Pair
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 520
Computer Vision with a Superpixelation Camera
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 521
Color-Encoded Illumination for High-Speed Volumetric Scene Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 522
Multi-Scale Gradient-Guided Unrolling Architecture with Adaptive Mamba for Compressive Sensing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 523
Deciphering Genotype-Phenotype Mechanisms from High-Content Profiling via Knowledge-Guided Multi-modal Graph Learning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 524
Bulk RNA-seq Guided Multi-modal Detection of Anomalous Regions in Human Cancer via Spatial Transcriptomics
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 525
Intervention-Aware Multiscale Representation Learning from Imaging Phenomics and Perturbation Transcriptomics
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 526
ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 528
PromptLoop: Plug-and-Play Prompt Refinement via Latent Feedback for Diffusion Model Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 529
EvoID: Reinforced Evolution for Identity-Preserving Video Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 530
Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 531
PhyCo: Learning Controllable Physical Priors for Generative Motion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 532
Unified Multimodal Models as Auto-Encoders
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 533
Expand and Prune: Maximizing Trajectory Diversity for Effective GRPO in Generative Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 535
Drainage: A Unifying Framework for Addressing Class Uncertainty
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 536
Neural Differentiation in Deep Networks: A Theoretical Framework for Expressivity and Representational Diversity
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 537
DuetMerging: Synergizing Dynamic and Static Strategies for Mitigating Task Interference in Model Merging
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 538
SASNet: Spatially-Adaptive Sinusoidal Networks for INRs
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 539
Generative Modeling of Weights: Generalization or Memorization?
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 540
Vision-Oriented Lightweight Neural Architecture Search with Budget-Adaptive Evaluation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 542
Stepwise Credit Assignment for GRPO on Flow-Matching Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 543
FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 546
Image-to-Point Cloud Feature Back-Projection for Multimodal Training of 3D Semantic Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 547
NG-GS: NeRF-guided 3D Gaussian Splatting Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 548
Teaching DINOv3 About Partial 3D Geometry: A Self-Supervised Geometry-Aware Approach
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 549
SemLayer: Semantic-aware Generative Segmentation and Layer Construction for Abstract Icons
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 550
MatchED: Crisp Edge Detection Using End-to-End, Matching-based Supervision
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 551
SegGBC: Justifiable Coarse-to-Fine Granular-Ball Computing for Enhancing Clustering Image Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 552
Seeing Beyond: Extrapolative Domain Adaptive Panoramic Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 553
MatchMask: Mask-Centric Generative Data Augmentation for Label-Scarce Semantic Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 554
Boundary-Responsive Differentiable Gating for Superpixel-Based Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 555
Task-Oriented Data Synthesis and Control-Rectify Sampling for Remote Sensing Semantic Segmentation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 556
FUSAR-GPT: A Spatiotemporal Feature-Embedded and Two-Stage Decoupled Visual Language Model for SAR Imagery
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 557
UniChange: Unifying Change Detection with Multimodal Large Language Model
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 558
Spatiotemporal Pyramid Flow Matching for Climate Emulation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 559
See What We Cannot See: A Geo-guided Reasoning Benchmark for Object Counting under Adverse Earth Observation Conditions
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 560
MM-OVSeg: Multimodal Optical–SAR Fusion for Open-Vocabulary Segmentation in Remote Sensing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 561
RECS4R: Bridging Semantics and Geometry for Referring Remote Sensing Interpretation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 562
Fourier Angle Alignment for Oriented Object Detection in Remote Sensing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 563
Learning to Infer Parameterized Representations of Plants from 3D Scans
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 564
Good Can Sometimes be Bad: A Unified Attack against 3D Point Cloud Classifier by a Flexible Isotropic Resampling
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 565
V-Attack: Targeting Disentangled Value Features for Controllable Adversarial Attacks on LVLMs
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 566
FeatureFool: Zero-Query Fooling of Video Models via Feature Map
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 567
RankOOD - Class Ranking-based Out-of-Distribution Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 568
AdvFM: Lookahead Flow-Matching Velocity-Field Attacks for Imperceptible and Transferable Adversarial Examples
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 569
The Power of Decaying Steps: Enhancing Attack Stability and Transferability for Sign-based Optimizers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 570
Your Classifier Can Do More: Towards Balancing the Gaps in Classification, Robustness, and Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 571
Learning Mutual View Information Graph for Adaptive Adversarial Collaborative Perception
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 572
Hierarchical Attacks for Multi‑Modal Multi‑Agent Reasoning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 573
Omni-Attack: Adversarial Attacks on Open-Ended VQA in Black-Box Multimodal LLMs
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 574
CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 575
Δynamics: Language-Based Representation for Inferring Rigid-Body Dynamics From Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 576
PvP: Data-Efficient Humanoid Robot Learning with Proprioceptive-Privileged Contrastive Representations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 577
Diagnose, Correct, and Learn from Manipulation Failures via Visual Symbols
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 579
GeCo-SRT: Geometry-aware Continual Adaptation for Cross-Task Sim-to-Real Transfer
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 580
ActiveGrasp: Information-Guided Active Grasping with Calibrated Energy-based Model
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 581
BiPreManip: Learning Affordance-Based Bimanual Pre-Manipulation through Anticipatory Collaboration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 582
Learning Surgical Robotic Manipulation with 3D Spatial Priors
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 583
SimRecon: SimReady Compositional Scene Reconstruction from Real Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 584
STRNet: Visual Navigation with Spatio-Temporal Representation through Dynamic Graph Aggregation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 585
RaUF: Learning the Spatial Uncertainty Field of Radar
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 586
SIR: Structured Image Representations for Explainable Robot Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 587
Instance-level Visual Active Tracking with Occlusion-Aware Planning
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 588
Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 589
AnthroTAP: Learning Point Tracking with Real-World Motion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 590
Tracking by Predicting 3-D Gaussians Over Time
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 591
Toward Low-Cost yet Effective Temporal Learning for UAV Tracking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 592
Rethinking Two-Stage Referring-by-Tracking in Referring Multi-Object Tracking: Make it Strong Again
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 593
Occlusion-Aware SORT: Observing Occlusion for Robust Multi-Object Tracking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 594
CoWTracker: Tracking by Warping instead of Correlation
[
Slides]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 595
Learning Long-term Motion Embeddings for Efficient Kinematics Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 596
SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 597
Beyond Explicit Language: Plug-and-Play Visual-to-Linguistic Modeling Toward General Object Tracking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 598
FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 599
InvCoSS: Inversion-driven Continual Self-supervised Learning in Medical Multi-modal Image Pre-training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 600
PETAR: Localized Findings Generation with Mask-Aware Vision-Language Modeling for PET Automated Reporting
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 601
From Panel to Pixel: Zoom-In Vision–Language Pretraining from Biomedical Scientific Literature
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 602
LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 603
D2T2 - Multimodal Automated Planning for Brachytherapy
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 604
TopoCL: Topological Contrastive Learning for Medical Imaging
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 605
Diffusion with a Linguistic Compass: Steering the Generation of Clinically Plausible Future sMRI Representations for Early MCI Conversion Prediction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 606
Personalized Longitudinal Medical Report Generation via Temporally-Aware Federated Adaptation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 607
Decoding 3D Perception via BrainSSD: Synergistic Fusion of EEG Representations from Static and Dynamic Visual Streams
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 608
Duala: Dual-Level Alignment of Subjects and Stimuli for Cross-Subject fMRI Decoding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 609
OmniBrainBench: A Comprehensive Multimodal Benchmark for Brain Imaging Analysis Across Multi-stage Clinical Tasks
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 610
Beyond Pixel Simulation: Pathology Image Generation via Diagnostic Semantic Tokens and Prototype Control
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 611
MedFG-VQA: Low-Frequency Memory and Graph Attention for Lightweight Medical VQA
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 612
FISHuman: Fine-grained Single-image 3D Human Reconstruction via Multi-view 4D Remeshing
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 613
DuoMo: Dual Motion Diffusion for World-Space Human Reconstruction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 614
RAM: Recover Any 3D Human Motion in-the-Wild
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 615
From 2D Alignment to 3D Plausibility: Unifying Heterogeneous 2D Priors and Penetration-Free Diffusion for Occlusion-Robust Two-Hand Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 616
MV-Fashion: Towards Enabling Virtual Try-On and Size Estimation with Multi-View Paired Data
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 617
Forecasting 3D Scanpaths in Egocentric Video
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 618
M4Human: A Large-Scale Multimodal mmWave Radar Benchmark for Human Mesh Reconstruction
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 619
ReGenHOI: Unifying Reconstruction and Generation for 3D Human–Object Interaction Understanding
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 620
Through the Frequency Lens: Cross-Domain Generalisable Gaze Estimation with Adaptive Modulation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 621
Mocap-2-to-3: Multi-view Lifting for Monocular Motion Recovery with 2D Pretraining
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 622
SHands: A Multi-View Dataset and Benchmark for Surgical Hand-Gesture and Error Recognition Toward Medical Training
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 623
Beyond Static Frames: Temporal Aggregate-and-Restore Vision Transformer for Human Pose Estimation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 624
IMU-HOI: A Symbiotic Framework for Coherent Human-Object Interaction and Motion Capture via Contact-Conscious Inertial Fusion
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 625
Learning Forgery-Aware Lip Representations Without Forgery Priors
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 626
Beyond [CLS] Token: Query-Driven Token-Level Forgery Purification for Generalizable Deepfake Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 627
GEM-TFL: Bridging Weak and Full Supervision for Forgery Localization through EM-Guided Decomposition and Temporal Refinement
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 628
TokenTrace: Multi-Concept Attribution through Watermarked Token Recovery
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 629
Unleashing Vision-Language Semantics for Deepfake Video Detection
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 630
A Difference-in-Difference Approach to Detecting AI-Generated Images
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 631
RDFace: A Benchmark Dataset for Rare Disease Facial Image Analysis under Extreme Data Scarcity and Phenotype-Aware Synthetic Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 632
ActivityForensics: A Comprehensive Benchmark for Localizing Manipulated Activity in Videos
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 633
Zero-shot Detection of AI-Generated Image via RAW-RGB Alignment
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 634
Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 635
Investigating Self-Supervised Representations for Audio-Visual Deepfake Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 636
TIACam: Text-Anchored Invariant Feature Learning with Auto-Augmentation for Camera-Robust Zero-Watermarking
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 637
FastRef: Fast Prototype Refinement for Few-shot Industrial Anomaly Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 638
RC-NF: Robot-Conditioned Normalizing Flow for Real-Time Anomaly Detection in Robotic Manipulation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 639
Reasoning-Driven Anomaly Detection and Localization with Image-Level Supervision
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 641
Wavelet-Driven 3D Anomaly Detection under Pose-Agnostic and Sparse-View
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 642
Hunting Normality from Query Sample via Residual Learning for Generalist Anomaly Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 643
GPFlow: Gaussian Prototype Probability Flow for Unsupervised Multi-Modal Anomaly Detection
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 644
HP-Edit: A Human-Preference Post-Training Framework for Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 645
It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 646
RebRL: Reinforcing Discrete Visual Diffusion Models with Rebalanced Timestep Credits
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 647
Ego-InBetween: Generating Object State Transitions in Ego-Centric Videos
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 648
Towards Fine-Grained Attribution: Instance-Aware Preference Optimization for Aligning Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 649
SketchRevive: Fine-Grained Pixel-to-Vector Sketch Completion with Diffusion-Prior-Guided Multimodal LLMs
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 650
UniPercept: A Unified Diffusion Model for Generalizable Visual Perception
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 651
Visual Diffusion Models are Geometric Solvers
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 652
You Only Erase Once: Erasing Anything without Bringing Unexpected Content
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 653
Smoothing the Score Function to Enhance Generalization in Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 654
NS-Diff: Fluid Navier–Stokes Guided Video Diffusion via Reinforcement Learning
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 655
PropFly: Learning to Propagate via On-the-Fly Supervision from Pre-trained Video Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 656
Generative Neural Video Compression via Video Diffusion Prior
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 657
AdaCluster: Adaptive Query-Key Clustering for Sparse Attention in Video Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 658
Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 659
Image Diffusion Preview with Consistency Solver
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 660
The Drift Kernel: Why Diffusion Models Change Even When Told Not To
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 661
Interpretable Prompts made Edit-Friendly: Token-to-Token Similarity Reduction in dLLMs for Edit-Friendly Hard Prompt Inversion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 662
LESA: Learnable Stage-Aware Predictors for Diffusion Model Acceleration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 663
Vision Foundation Models Can Be Good Tokenizers for Latent Diffusion Models
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 664
Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 665
Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 666
EasyOmnimatte: Taming Pretrained Inpainting Diffusion Models for End-to-End Video Layered Decomposition
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 667
Hierarchical Codec Diffusion for Video-to-Speech Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 668
Semantic Alignment for Pose-Invariant Identity Preserving Diffusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 669
Causality in Video Diffusers is Separable from Denoising
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 670
2ndMatch: Finetuning Pruned Diffusion Models via Second-Order Jacobian Matching
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 671
Hear What You See: Video-to-Audio Generation with Diffusion Transformer and Semantic-Temporal Alignment-Ranked Direct Preference Optimization
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 672
MacTok: Robust Continuous Tokenization for Image Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 673
Group Editing: Edit Multiple Images in One Go
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 674
Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 675
Beyond the Golden Data: Resolving the Motion-Vision Quality Dilemma via Timestep Selective Training
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 676
Toward Diffusible High-Dimensional Latent Spaces: A Frequency Perspective
[
Poster]
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 677
Elucidating the SNR-t Bias of Diffusion Probabilistic Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 678
What Is It Like to Be a Noise? An Entropy-based Gaussian Noise Regularization for Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 679
FlashVSR: Towards Real-time Diffusion-Based Streaming Video Super Resolution
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 680
DiffusionHarmonizer: Bridging Neural Reconstruction and Photorealistic Simulation with Online Diffusion Enhancer
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 681
GDRO: Group-level Reward Post-training Suitable for Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 682
RFDM: Residual Flow Diffusion Models for Video Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 683
FreqEdit: Preserving High-Frequency Features for Robust Multi-Turn Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 684
Graph-Guided Online Concept Erasure for Text-to-Image Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 685
HierEdit: Region-Aware Hierarchical Diffusion for Efficient High-Resolution Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 686
CTCal: Rethinking Text-to-Image Diffusion Models via Cross-Timestep Self-Calibration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 687
Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 688
DeltaQuant: 4-bit Video Diffusion Models with Spatiotemporal Delta Smoothing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 689
D2Cache: Second-Order Delta Caching for Higher Video Diffusion Acceleration
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 690
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 691
Test-Time Alignment of Text-to-Image Diffusion Models via Null-Text Embedding Optimisation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 692
Accelerating Diffusion Model Training under Minimal Budgets: A Condensation-Based Perspective
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 693
Denoising as Path Planning: Training-Free Acceleration of Diffusion Models with DPCache
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 694
Taming Sampling Perturbations with Variance Expansion Loss for Latent Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 695
Guiding Diffusion Models with Semantically Degraded Conditions
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 696
Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 698
Coupled Diffusion Sampling for Training-Free Multi-View Image Editing
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 699
Improving Diffusion Generalization with Weak-to-Strong Segmented Guidance
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 700
Adaptive Auxiliary Prompt Blending for Target-Faithful Diffusion Generation
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 701
SegQuant: A Semantics-Aware and Generalizable Quantization Framework for Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 702
BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 703
Accelerating Autoregressive Video Diffusion via History-Guided Cache and Residual Correction
Poster
Sun Jun 07 02:30 PM -- 04:30 PM (PDT) @ ExHall A 704
MusicInfuser: Making Video Diffusion Listen and Dance