CVPR Poster Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning

Huajie Jiang · Zhengxian Li · Xiaohan Yu · Yongli Hu · Baocai Yin · Jian Yang · Yuankai Qi

ExHall D Poster #426

Sat 14 Jun 3 p.m. PDT — 5 p.m. PDT

Abstract:

Generalized zero-shot learning aims to recognize both seen and unseen classes with the help of semantic information that is shared among different classes. It inevitably requires consistent visual-semantic alignment. Existing approaches fine-tune the visual backbone on seen-class data to obtain semantic-related visual features, which may cause overfitting on seen classes with a limited number of training images. This paper proposes a novel visual and semantic prompt collaboration framework, which utilizes prompt tuning techniques for efficient feature adaptation. Specifically, we design a visual prompt to integrate the visual information for discriminative feature learning and a semantic prompt to integrate the semantic information for visual-semantic alignment. To achieve effective prompt information integration, we further design a weak prompt fusion mechanism for the shallow layers and a strong prompt fusion mechanism for the deep layers of the network. Through the collaboration of visual and semantic prompts, we can obtain discriminative semantic-related features for generalized zero-shot image recognition. Extensive experiments demonstrate that our framework consistently achieves favorable performance on both conventional zero-shot learning and generalized zero-shot learning benchmarks compared to other state-of-the-art methods.
