Skip to yearly menu bar Skip to main content


Tracking Multiple Deformable Objects in Egocentric Videos

Mingzhen Huang · Xiaoxing Li · Jun Hu · Honghong Peng · Siwei Lyu

West Building Exhibit Halls ABC 139


Most existing multiple object tracking (MOT) methods that solely rely on appearance features struggle in tracking highly deformable objects. Other MOT methods that use motion clues to associate identities across frames have difficulty handling egocentric videos effectively or efficiently. In this work, we propose DETracker, a new MOT method that jointly detects and tracks deformable objects in egocentric videos. DETracker uses three novel modules, namely the motion disentanglement network (MDN), the patch association network (PAN) and the patch memory network (PMN), to explicitly tackle the difficulties caused by severe ego motion and fast morphing target objects. DETracker is end-to-end trainable and achieves near real-time speed. We also present DogThruGlasses, a large-scale deformable multi-object tracking dataset, with 150 videos and 73K annotated frames, collected by smart glasses. DETracker outperforms existing state-of-the-art method on the DogThruGlasses dataset and YouTube-Hand dataset.

Chat is not available.