FSLoRA: Harmonizing Detection and Re-Identification via Freq-Spatial Low-Rank Adapter for One-Stage Person Search
Abstract
Person search aims to detect and re-identify individuals in unconstrained scenes. One-stage models face an inherent conflict between these two sub-tasks: pedestrian detection relies on features shared across all humans, while person re-identification requires identity-specific representations. Existing approaches, such as feature decoupling and loss re-weighting, primarily address this conflict in later network stages and leave early-stage feature entanglement unresolved. To overcome this limitation, we propose FSLoRA, a Freq-Spatial Low-Rank Adapter that progressively decouples task-specific features at the backbone level. FSLoRA consists of two modules. The Spatial-Level Module (SLM) employs LoRA and a mixture-of-experts to dynamically activate task-relevant spatial features. The Frequency-Level Module (FLM) transforms features into the frequency domain to selectively enhance task-relevant frequency components while suppressing task-irrelevant noise. By integrating spatial and frequency-level adaptations, FSLoRA reduces feature interference and enables more effective joint optimization. Extensive experiments on CUHK-SYSU, PRW, and PoseTrack21 demonstrate that FSLoRA achieves state-of-the-art performance and also serves as a plug-and-play module adaptable to various person search frameworks, offering a unified and generalizable solution for one-stage person search.
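To make the two adaptation ideas concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes a standard LoRA update (a frozen weight W plus a scaled low-rank product B A), a dense softmax gate over experts, and a per-bin reweighting of a 1-D discrete Fourier transform standing in for the 2-D transform a feature map would need. The names `lora_moe`, `freq_filter`, the gating matrix `G`, and the `alpha / r` scaling are all assumptions introduced here for illustration; the abstract does not specify them.

```python
import cmath
import math


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def lora_moe(x, W, experts, G, alpha=1.0):
    """Frozen base layer plus a gate-weighted sum of low-rank updates.

    experts: list of (A, B) pairs, A of shape (r, d_in), B of shape (d_out, r).
    G: hypothetical gating matrix of shape (num_experts, d_in).
    """
    gates = softmax(matvec(G, x))        # one mixing weight per expert
    y = matvec(W, x)                     # frozen base projection W x
    for g, (A, B) in zip(gates, experts):
        r = len(A)                       # rank of this expert
        delta = matvec(B, matvec(A, x))  # low-rank path B (A x)
        y = [yi + g * (alpha / r) * di for yi, di in zip(y, delta)]
    return y


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, kept simple for clarity."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def freq_filter(x, weights):
    """Scale each frequency bin by a weight (assumed learned in practice).

    The real part is returned; weights should be conjugate-symmetric
    if an exactly real-valued result is required.
    """
    X = dft(x)
    Y = [w * v for w, v in zip(weights, X)]
    return [v.real for v in dft(Y, inverse=True)]
```

With `B` initialized to zeros, as is conventional for LoRA, `lora_moe` reproduces the frozen layer's output, so the adapter starts as an identity modification. In `freq_filter`, a weight of 1 leaves a frequency bin untouched and a weight of 0 removes it, which is one simple way to read "enhance" versus "suppress" in the description above.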