CVPR Poster Free Lunch Enhancements for Multi-modal Crowd Counting

Haoliang Meng · Xiaopeng Hong · Zhengqin Lai · Miao Shang

ExHall D Poster #322

Sat 14 Jun 8:30 a.m. PDT — 10:30 a.m. PDT

Abstract:

This paper addresses multi-modal crowd counting with a novel 'free lunch' training enhancement strategy that requires no additional data, parameters, or increased inference complexity. First, we introduce a cross-modal alignment technique as a plug-in post-processing step for the pre-trained backbone network, enhancing the model’s ability to capture shared information across modalities. Second, we incorporate a regional density supervision mechanism during the fine-tuning stage, which differentiates features in regions with varying crowd densities. Extensive experiments on three multi-modal crowd counting datasets validate our approach, making it the first to achieve an MAE below 10 on RGBT-CC.
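For readers unfamiliar with the metric quoted above, MAE in crowd counting is conventionally the mean absolute difference between the predicted and ground-truth head counts per image. The sketch below illustrates that standard definition only; the function name `count_mae` and the sample numbers are hypothetical and are not taken from the paper or from RGBT-CC.

```python
def count_mae(predicted_counts, true_counts):
    """Mean absolute error between per-image predicted and true crowd counts."""
    if len(predicted_counts) != len(true_counts):
        raise ValueError("both sequences must have the same length")
    if len(true_counts) == 0:
        raise ValueError("at least one image is required")
    total_error = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total_error / len(true_counts)


# Hypothetical counts for three images (illustrative only, not real results).
predicted = [102.4, 58.0, 311.7]
actual = [100, 63, 300]
mae = count_mae(predicted, actual)
print(f"MAE: {mae:.2f}")
```

Under this definition, an MAE below 10 means the predicted count is off by fewer than 10 people per image on average.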
