Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding
Ziyang Wang, Honglu Zhou, Shijie Wang, Junnan Li, Caiming Xiong, Silvio Savarese, Mohit Bansal, Michael S. Ryoo, Juan Carlos Niebles
Keywords: Vision, Language, and Reasoning