Poster
De²Gaze: Deformable and Decoupled Representation Learning for 3D Gaze Estimation
Yunfeng Xiao · Xiaowei Bai · Baojun Chen · Hao Su · Hao He · Liang Xie · Erwei Yin
3D gaze estimation is challenging for two main reasons. First, existing methods analyze dense features (e.g., large areas of pixels), which are sensitive to local noise (e.g., light spots, blur) and increase computational complexity. Second, a single eyeball model can exhibit multiple gaze directions, and the coupled representation of gazes and models increases learning difficulty. To address these issues, we propose De²Gaze, a lightweight and accurate model-aware 3D gaze estimation method. De²Gaze introduces two improvements for deformable and decoupled representation learning. First, we propose a deformable sparse attention mechanism that adapts sparse sampling points to attention areas, avoiding the influence of local noise. Second, we propose a spatial decoupling network with a double-branch decoding architecture that disentangles invariant features (e.g., eyeball radius and position) from variable features (e.g., gaze, pupil, iris) in latent space. Compared with existing methods, De²Gaze requires only sparse features and achieves faster convergence, lower computational complexity, and higher accuracy in 3D gaze estimation. Qualitative and quantitative experiments show that De²Gaze achieves state-of-the-art gaze estimation accuracy and semantic segmentation performance on the TEyeD dataset.
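The deformable sparse attention described above can be pictured as each query predicting a handful of sampling offsets around a reference point and attending only to those sparse locations, rather than to the full dense feature map. The sketch below is a minimal, hypothetical PyTorch illustration of that general idea, not the authors' released code; the module name DeformableSparseAttention, the number of sampling points, and the offset scaling are all assumptions made for illustration.

```python
# Hypothetical sketch of deformable sparse attention (not the authors' code):
# each query predicts a few sampling offsets, gathers sparse features at those
# locations via bilinear sampling, and aggregates them with learned weights.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DeformableSparseAttention(nn.Module):
    def __init__(self, dim: int, num_points: int = 8):
        super().__init__()
        self.num_points = num_points
        # Predict (dx, dy) offsets for each sparse sampling point.
        self.offset_proj = nn.Linear(dim, num_points * 2)
        # Predict one attention weight per sampling point.
        self.weight_proj = nn.Linear(dim, num_points)
        self.value_proj = nn.Conv2d(dim, dim, kernel_size=1)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, query: torch.Tensor, feat: torch.Tensor,
                ref_points: torch.Tensor) -> torch.Tensor:
        # query:      (B, Q, C)    query embeddings
        # feat:       (B, C, H, W) dense image features
        # ref_points: (B, Q, 2)    reference locations in [-1, 1] (grid_sample coords)
        B, Q, C = query.shape
        value = self.value_proj(feat)                                 # (B, C, H, W)

        offsets = self.offset_proj(query).view(B, Q, self.num_points, 2)
        offsets = offsets.tanh() * 0.1                                # keep offsets local
        sample_grid = ref_points.unsqueeze(2) + offsets               # (B, Q, P, 2)

        # Sparse bilinear sampling at the predicted locations only.
        sampled = F.grid_sample(value, sample_grid,
                                mode="bilinear", align_corners=True)  # (B, C, Q, P)
        sampled = sampled.permute(0, 2, 3, 1)                         # (B, Q, P, C)

        attn = self.weight_proj(query).softmax(dim=-1)                # (B, Q, P)
        out = (attn.unsqueeze(-1) * sampled).sum(dim=2)               # (B, Q, C)
        return self.out_proj(out)


if __name__ == "__main__":
    B, C, H, W, Q = 2, 64, 32, 32, 16
    attn = DeformableSparseAttention(dim=C)
    feat = torch.randn(B, C, H, W)
    query = torch.randn(B, Q, C)
    ref = torch.rand(B, Q, 2) * 2 - 1          # random reference points in [-1, 1]
    print(attn(query, feat, ref).shape)        # torch.Size([2, 16, 64])
```

Under this reading, the spatial decoupling network would then pass the aggregated query features to two separate decoder branches, one regressing invariant eyeball parameters and one regressing per-frame gaze and pupil/iris variables; that second component is not shown here.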