Spike-driven Discrete Aggregation for Event-based Object Detection
Huaning Li ⋅ Ziming Wang ⋅ Runhao Jiang ⋅ Rui Yan ⋅ Huajin Tang
Abstract
With their high dynamic range and temporal resolution, event cameras are well-suited for object detection, especially under motion blur and extreme illumination. Recent state-of-the-art works for event-based object detection primarily focus on the high-level design of backbones. However, developing effective event representations is equally crucial, as it bridges asynchronous event streams with the dense tensors required by detection networks. Most existing aggregation strategies for event representation continuously accumulate all events within sampled intervals without selective filtering, inevitably introducing uninformative events that degrade detection accuracy. To address this limitation, we introduce a novel perspective, termed Discrete Aggregation, which adaptively and discretely selects informative events for differentiable aggregation. We realize this through the Spiking Discrete Aggregation (SDA) module, which is inspired by the threshold-based spike firing mechanism in Spiking Neural Networks (SNNs) and is implemented with gated recurrent spiking neurons. Additionally, we introduce the Multi-Timescale Fusion (MTF) method, which leverages coarse-grained temporal features from continuous event streams to further enhance the representational capability of SDA. Experimental results on neuromorphic datasets demonstrate that our method achieves state-of-the-art performance among all fully spiking architectures while using fewer parameters, reaching 43.4\% $mAP_{50:95}$ on Gen1 (+4.5\% over prior art). Moreover, our method exhibits superior robustness under noisy conditions and shows strong compatibility with non-spiking models.
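To make the threshold-based selection idea concrete, the following is a minimal toy sketch of how a gated recurrent spiking neuron could discretely select informative time steps from an event stream. All names, parameters, and the update rule here are illustrative assumptions, not the paper's actual SDA formulation.

```python
def gated_spiking_select(event_magnitudes, threshold=1.0, decay=0.9, gate=0.5):
    """Toy gated recurrent spiking neuron (illustrative, not the paper's SDA).

    Leakily accumulates gated per-step event magnitudes into a membrane
    potential and "fires" (selects that step for aggregation) whenever the
    potential crosses the threshold, followed by a soft reset.
    """
    v = 0.0          # membrane potential
    selected = []    # indices of time steps whose events are kept
    for t, x in enumerate(event_magnitudes):
        v = decay * v + gate * x   # gated leaky accumulation
        if v >= threshold:         # threshold-based spike firing
            selected.append(t)
            v -= threshold         # soft reset after the spike
    return selected
```

Under this toy rule, sparse low-magnitude steps rarely trigger a spike, while a burst of informative events does, so aggregation becomes event-driven rather than uniform over the sampling interval.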