RAVEN: Radar Adaptive Vision Encoders for Efficient Chirp-wise Object Detection and Segmentation
Abstract
We introduce RAVEN, a deep learning architecture for processing frequency-modulated continuous-wave (FMCW) radar data, designed for high computational efficiency. RAVEN reduces computation in two ways: a learnable antenna-mixer module, applied on top of independent per-receiver state-space model (SSM) encoders, compresses the virtual MIMO array into a compact set of learned features; and a calibrated early-exit rule enables per-chirp inference, so the model can reach a decision using only a subset of the chirps in a radar frame. Together, these design choices yield up to 170× lower computation and 4× lower end-to-end latency than conventional frame-based radar backbones, while achieving state-of-the-art detection and BEV free-space segmentation performance on automotive radar datasets.
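The per-chirp early-exit idea in the abstract can be illustrated with a minimal sketch. The code below is not the paper's method: `step_fn`, the toy model, and the plain confidence threshold are all hypothetical stand-ins for RAVEN's recurrent chirp encoder and its calibrated exit rule; it only shows the control flow by which a frame's chirps are consumed one at a time until the prediction is confident enough.

```python
def early_exit_inference(chirps, step_fn, threshold=0.9):
    """Process chirps sequentially with a hypothetical recurrent step
    `step_fn(state, chirp) -> (state, probs)`; stop as soon as the top
    class probability clears `threshold` (a plain-threshold stand-in
    for a calibrated early-exit rule)."""
    state, probs = None, None
    for i, chirp in enumerate(chirps, start=1):
        state, probs = step_fn(state, chirp)
        if max(probs) >= threshold:
            return probs, i  # decided after i chirps; rest are skipped
    return probs, len(chirps)  # no early exit: used the whole frame

# Toy demo (hypothetical): confidence in class 0 grows with each chirp.
def toy_step(state, chirp):
    s = (state or 0) + 1
    p0 = 1.0 - 0.5 * 0.8 ** (s - 1)
    return s, [p0, 1.0 - p0]

probs, used = early_exit_inference([None] * 16, toy_step, threshold=0.9)
print(f"decision after {used} of 16 chirps, confidence {probs[0]:.3f}")
```

In this toy run the rule fires before the frame ends, which is the source of the claimed compute savings: later chirps in the frame are never encoded.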