

Workshop

3D Vision Language Model for Robotics Manipulation: Opportunities and Challenges

Jiafei Duan · Muhammad Zubair Irshad · Ishika Singh · Vitor Guizilini · Rares Andrei Ambrus · Zsolt Kira

101 A

Wed 11 Jun, 7 a.m. PDT

Keywords:  Robotic Manipulation  

The intersection of 3D Vision-and-Language models (3D VLMs) and robotics presents a new frontier, blending spatial understanding with contextual reasoning. The Robo-3DVLM workshop explores the opportunities and challenges of integrating these technologies to enhance robot perception, decision-making, and interaction with the real world. As robots operate in increasingly complex environments, bridging the gap between 3D spatial reasoning and language understanding becomes critical. The workshop aims to drive conversations around the utility of 3D in robotic vision, the role of language in perception, and the limitations imposed by current data and hardware constraints. Through invited talks and interactive sessions, we aim to unite researchers from diverse disciplines to push the boundaries of multimodal learning in robotics, setting the stage for the next generation of intelligent systems.
