Poster

FIFA: Fine-grained Inter-frame Attention for Driver's Video Gaze Estimation

Daosong Hu · Mingyue Cui · Kai Huang


Abstract:

Gaze direction is a pivotal indicator of a driver's level of attention. While image-based gaze estimation has been extensively researched, there has been a recent shift towards estimating gaze direction from video sequences. This approach faces notable challenges, including modeling how the pupil evolves across frames and extracting head pose information from a relatively static background. To address these challenges, we introduce a dual-stream deep learning framework that explicitly models the displacement of the pupil through a fine-grained inter-frame attention mechanism and generates weights to adjust gaze embeddings. This technique divides the face into a set of distinct patches and employs cross-attention to determine how pixel displacements in the various patches correlate across adjacent frames, thereby tracking spatial dynamics within the sequence. Our method is validated on two publicly available driver gaze datasets, and the results indicate that it achieves state-of-the-art performance, or performance on par with the best existing results, while using fewer parameters.
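
The patch-based cross-attention described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes grayscale frames stored as nested lists, non-overlapping square patches, and identity query/key/value projections in place of learned ones. The helper names `patchify`, `softmax`, and `cross_attention` are hypothetical and chosen only for illustration.

```python
import math


def patchify(frame, patch):
    """Split a 2D frame (list of rows) into non-overlapping patch x patch
    tiles, each flattened to a vector. Assumed layout, not from the paper."""
    height, width = len(frame), len(frame[0])
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            patches.append([frame[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention with identity projections.
    Queries come from one frame, keys and values from the adjacent frame."""
    dim = len(queries[0])
    weights, outputs = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[i] for wj, v in zip(w, values))
                        for i in range(len(values[0]))])
    return weights, outputs


# Toy example: a bright 2x2 blob moves from the top-left patch in frame t
# to the top-right patch in frame t+1.
frame_t = [[1, 1, 0, 0],
           [1, 1, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
frame_next = [[0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 0],
              [0, 0, 0, 0]]

patches_t = patchify(frame_t, 2)
patches_next = patchify(frame_next, 2)
weights, outputs = cross_attention(patches_t, patches_next, patches_next)

# Index of the next-frame patch that the top-left patch attends to most.
best_match = weights[0].index(max(weights[0]))
```

In this toy case the top-left patch of the first frame places most of its attention on the top-right patch of the next frame, which is one way an attention map can encode displacement between adjacent frames. A trained model would use learned projections and feature embeddings rather than raw pixel values.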
