Occlusion-Aware SORT: Observing Occlusion for Robust Multi-Object Tracking
Abstract
Multi-object tracking (MOT) in computer vision involves analyzing object trajectories and counting the objects in video sequences. However, 2D MOT suffers from cost confusion arising from partial occlusion. To address this issue, we present the Occlusion-Aware SORT (OA-SORT) framework, which introduces three components: the Occlusion-Aware Module (OAM), the Occlusion-Aware Offset (OAO), and the Bias-Aware Momentum (BAM). First, OAM assesses the occlusion status (i.e., occlusion severity) of objects and introduces a Gaussian Map (GM) to reduce background influence. Two plug-and-play, training-free components, OAO and BAM, are further proposed. Specifically, OAO leverages the OAM-derived bias from the Kalman Filter's position estimates to compensate the positional cost, thereby mitigating cost confusion. BAM, in turn, uses the OAM-derived bias from the latest trajectory observations to optimize the Kalman Filter's motion parameters, suppressing estimation fluctuations. Comprehensive evaluations on the DanceTrack, SportsMOT, and MOT17 datasets demonstrate the importance of occlusion handling in MOT. On the DanceTrack test set, OA-SORT achieves 63.1% HOTA and 64.2% IDF1. Furthermore, integrating the occlusion-aware framework into four additional trackers improves HOTA and IDF1 by an average of 2.08% and 3.05%, respectively, on DanceTrack, demonstrating the reusability of the framework and its components. Ablation studies further validate the effectiveness of the three components and highlight the key role of the Gaussian Map.
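To make the notion of occlusion severity with a Gaussian Map more concrete, the following is a minimal illustrative sketch and not the OAM definition, which the abstract does not specify. It assumes axis-aligned integer pixel boxes, a depth heuristic in which a box with a lower bottom edge is closer to the camera, and a Gaussian weighting centred on the box so that covered pixels near the box border (more likely background) count less than covered pixels near the centre. The names `gaussian_map`, `occlusion_severity`, and `sigma_scale` are hypothetical.

```python
import numpy as np


def gaussian_map(h, w, sigma_scale=0.25):
    """2D Gaussian weights peaked at the box centre, normalised to sum to 1.

    sigma_scale is a hypothetical parameter: sigma = sigma_scale * box size.
    """
    ys = np.arange(h) - (h - 1) / 2.0
    xs = np.arange(w) - (w - 1) / 2.0
    gy = np.exp(-0.5 * (ys / (sigma_scale * h)) ** 2)
    gx = np.exp(-0.5 * (xs / (sigma_scale * w)) ** 2)
    g = np.outer(gy, gx)
    return g / g.sum()


def occlusion_severity(target, others):
    """Gaussian-weighted fraction of `target` covered by boxes in front of it.

    Boxes are integer (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive.
    Returns a value in [0, 1]; 0 means no occluder overlaps the target.
    """
    x1, y1, x2, y2 = target
    h, w = y2 - y1, x2 - x1
    covered = np.zeros((h, w), dtype=bool)
    for ox1, oy1, ox2, oy2 in others:
        # Assumption: only boxes with a lower bottom edge are in front.
        if oy2 <= y2:
            continue
        ix1, iy1 = max(x1, ox1), max(y1, oy1)
        ix2, iy2 = min(x2, ox2), min(y2, oy2)
        if ix1 < ix2 and iy1 < iy2:
            # Mark the overlapping pixels in the target's local coordinates.
            covered[iy1 - y1:iy2 - y1, ix1 - x1:ix2 - x1] = True
    return float((gaussian_map(h, w) * covered).sum())


target = (0, 0, 10, 20)
corner_occluder = (0, 15, 5, 30)  # covers 12.5% of the target's area
severity = occlusion_severity(target, [corner_occluder])
```

Under these assumptions, an occluder that only clips a corner of the target yields a severity below its plain area fraction, whereas one covering the box centre is weighted more heavily. A tracker could then use such a score to scale a positional cost correction, which is the role the abstract attributes to OAO.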