

Poster

DreamTrack: Dreaming the Future for Multimodal Visual Object Tracking

Mingzhe Guo · Weiping Tan · Wenyu Ran · Liping Jing · Zhipeng Zhang


Abstract:

To achieve class-agnostic perception, current visual object trackers commonly formulate tracking as a one-shot detection problem with a template-matching architecture. Despite this success, severe environmental variations in long-term tracking make it difficult for the tracker to generalize to novel situations. Temporal trackers address this by using historical predictions to keep the target information valid over time, e.g., by updating the template. However, merely passing previous observations forward, rather than learning from them, limits the tracker's ability to understand the tracking scenario from past experience, which is critical for generalizing to new frames. To address this issue, we reformulate temporal learning in visual tracking as a History-to-Future process and propose a novel tracking framework, DreamTrack. DreamTrack learns the temporal dynamics from past observations to dream the future variations of the environment, boosting generalization with future information extended from history. To account for the uncertainty of future variations, we design a multimodal prediction that infers the target trajectory for each possible future situation. Without bells and whistles, DreamTrack achieves leading performance with real-time inference speed. In particular, DreamTrack obtains SUC scores of 76.6% on LaSOT and 87.9% on TrackingNet, surpassing all recent SOTA trackers.
