

Poster

CocoER: Aligning Multi-Level Feature by Competition and Coordination for Emotion Recognition

Xuli Shen · Hua Cai · Weilin Shen · Qing Xu · Dingding Yu · Weifeng Ge · Xiangyang Xue


Abstract:

With the rapid growth of human-machine interaction, emotion recognition has attracted renewed attention. Previous works focus on improving visual feature fusion and reasoning across multiple image levels. Although integrating multi-level features (head, body and context) to deduce a person's emotion is non-trivial, the emotion recognition results at each level usually differ from one another, which creates inconsistency in prevailing feature alignment methods and decreases recognition performance. In this work, we propose a multi-level image feature refinement method for emotion recognition (CocoER) to mitigate the impact of conflicting results from multi-level recognition. First, we leverage cross-level attention to improve visual feature consistency between hierarchically cropped head, body and context windows. Then, vocabulary-informed alignment is incorporated into the recognition framework to produce pseudo labels and guide hierarchical visual feature refinement. To effectively fuse multi-level features, we devise a competition process that eliminates irrelevant image-level predictions and a coordination process that enhances the features across all levels. Extensive experiments on two popular datasets show that our method achieves state-of-the-art performance and provides multi-level interpretation results.
