Poster

SNIDA: Unlocking Few-Shot Object Detection with Non-linear Semantic Decoupling Augmentation

Yanjie Wang ⋅ Xu Zou ⋅ Luxin Yan ⋅ Sheng Zhong ⋅ Jiahuan Zhou

2024 Poster

Paper PDF [ Poster] [Paper PDF]

Abstract

Once only a few-shot annotated samples are available, the performance of learning-based object detection would be heavily dropped. Many few-shot object detection (FSOD) methods have been proposed to tackle this issue by adopting image-level augmentations in linear manners. Nevertheless, those handcrafted enhancements often suffer from limited diversity and lack of semantic awareness, resulting in unsatisfactory performance. To this end, we propose a Semantic-guided Non-linear Instance-level Data Augmentation method (SNIDA) for FSOD by decoupling the foreground and background to increase their diversities respectively. We design a semantic awareness enhancement strategy to separate objects from backgrounds. Concretely, masks of instances are extracted by an unsupervised semantic segmentation module. Then the diversity of samples would be improved by fusing instances into different backgrounds. Considering the shortcomings of augmenting images in a limited transformation space of existing traditional data augmentation methods, we introduce an object reconstruction enhancement module. The aim of this module is to generate sufficient diversity and non-linear training data at the instance level through a semantic-guided masked autoencoder. In this way, the potential of data can be fully exploited in various object detection scenarios. Extensive experiments on PASCAL VOC and MS-COCO demonstrate that the proposed method outperforms baselines by a large margin and achieves new state-of-the-art results under different shot settings.

Chat is not available.