CoVCR: Bridging Visual Narrative Gaps via Context Generation for Robust Commonsense Reasoning
Xinyu Li, Shiliang Sun
Keywords: Vision, Language, and Reasoning