See What We Cannot See: A Geo-guided Reasoning Benchmark for Object Counting under Adverse Earth Observation Conditions
Abstract
Object counting in remote sensing imagery becomes challenging when visual cues are obscured by clouds, fog, shadows, or low-light conditions. Yet earth observation inherently provides complementary geo-modalities, including land-use and map data, which offer stable structural and contextual priors that remain available when appearance cues fail. In this paper, we introduce \textbf{GROC}, the first large-scale dataset for \textbf{G}eo-guided \textbf{R}easoning in \textbf{O}bject \textbf{C}ounting under adverse earth observation conditions. GROC contains 1.2 million point annotations over 14K images, each aligned with three modalities that preserve the original geospatial information. We also provide a data engine for collecting large-scale object counting datasets with multiple geo-modalities, realistic degradations, and reliable annotations. We further present a counting agent that adaptively leverages geo-modalities to produce reliable estimates. Extensive experiments show that existing models struggle to “see” through adverse conditions, whereas geo-modalities improve robustness. GROC establishes the first benchmark that explicitly challenges models to \textbf{see what they cannot see}, charting a new direction for geo-guided amodal reasoning in earth observation.