Unlearning without Forgetting: Securely Removing Targeted Concepts from Large-Scale Vision-Language Open-Vocabulary Detectors
Zhongze Wu ⋅ Xiu Su ⋅ Feng Yang ⋅ Dan Niu ⋅ Shan You ⋅ Yueyi Luo ⋅ Jun Long
Abstract
Open-vocabulary object detectors (OvOD) inherit tightly coupled cross-modal knowledge from web-scale pretraining, creating privacy, copyright, and compliance risks. Existing machine unlearning methods face \emph{geometric entanglement interference} in OvOD: forgetting updates inevitably distort preserved knowledge due to shared semantic factors in decomposable embeddings. We introduce \textbf{SafeDetect}, a geometrically constrained unlearning framework that constructs a null space from preserved-knowledge embeddings offline, then constrains parameter updates to its orthogonal complement, mathematically preventing interference with retained concepts. Forgetting is achieved through a one-step mean-flow objective that drives forgotten concepts toward non-detectability, while multimodal decoupling prevents cross-modal recovery. We establish UOD-Bench, the first unified benchmark for OvOD unlearning, featuring 14.7K images with 67.3K region-phrase pairs across three tasks. Extensive experiments across UOD-Bench and standard benchmarks with diverse architectures (\textit{e.g.}, GroundingDINO, LLM-Det) demonstrate that SafeDetect achieves superior forgetting efficacy (64.75\% improvement over NPO) while maintaining stable retention performance and achieving significantly better zero-shot generalization, with 1.5$\times$ faster convergence than iterative methods. Code and benchmark will be released.
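The abstract's central mechanism, restricting forgetting updates to the orthogonal complement of a preserved-knowledge subspace, can be illustrated with a minimal sketch. The code below assumes a PyTorch setting; the matrix \texttt{K}, the helper names \texttt{build\_nullspace\_projector} and \texttt{constrained\_step}, and all shapes are illustrative assumptions, not the paper's released implementation.

```python
import torch

def build_nullspace_projector(K: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Return P = I - V V^T, the projector onto the orthogonal complement
    of the row space of K, built offline once.

    K: (n, d) matrix whose rows are preserved-knowledge embeddings
       (a hypothetical input; the paper's actual construction may differ).
    """
    # Right-singular vectors with non-negligible singular values span
    # the preserved subspace that updates must not touch.
    _, S, Vh = torch.linalg.svd(K, full_matrices=False)
    V = Vh[S > eps].T                      # (d, r) basis of preserved subspace
    d = K.shape[1]
    return torch.eye(d, dtype=K.dtype) - V @ V.T   # (d, d) null-space projector


def constrained_step(param: torch.Tensor, grad: torch.Tensor,
                     P: torch.Tensor, lr: float = 1e-3) -> None:
    """Apply a forgetting gradient projected into the null space, so the
    update has (numerically) zero component along preserved directions."""
    with torch.no_grad():
        param -= lr * (grad @ P)           # project each row's update


# Toy usage: preserved embeddings stay fixed under projected updates.
K = torch.randn(32, 256)                   # 32 preserved embeddings, dim 256
P = build_nullspace_projector(K)
W = torch.randn(10, 256)                   # some detector parameter block
g = torch.randn(10, 256)                   # raw forgetting gradient
constrained_step(W, g, P)
```

Because \texttt{P} annihilates every direction spanned by the preserved embeddings, any gradient pushed through it cannot move the model along those directions, which is the geometric guarantee the abstract attributes to SafeDetect.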