

Poster

Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation

Ming Xu · Stephen Gould


Abstract:

We propose a novel approach to the action segmentation task for long, untrimmed videos, based on solving an optimal transport problem. By encoding a temporal smoothness prior into a Gromov-Wasserstein problem, we are able to produce temporally consistent segmentations from noisy frame embeddings. Unlike previous approaches, our method does not require knowing the action order for a video to attain temporal consistency. Furthermore, our resulting (fused) Gromov-Wasserstein problem can be efficiently solved on GPUs using a Sinkhorn-style scaling algorithm. We demonstrate the effectiveness of our method in an unsupervised learning setting, where our method is used to generate pseudo-labels for self-training. We evaluate our segmentation approach and unsupervised learning pipeline on the Breakfast, 50-Salads, YouTube Instructions and Desktop Assembly datasets, yielding state-of-the-art results for the unsupervised video action segmentation task.
