DeepSketcher: Internalizing Visual Manipulation for Multimodal Reasoning
Chi Zhang, Haibo Qiu, Qiming Zhang, Zhixiong Zeng, Lin Ma, Jing Zhang
Keywords: Vision, Language, and Reasoning