Adaptive 3D Perception Under Sparse Sampling via Reinforcement Learning
Abstract
Detecting small aerial targets (SATs) from long-range LiDAR is challenging because point density changes dramatically with motion: fast flight produces ultra-sparse returns, while hovering or slow motion yields dense local clusters, breaking the fixed-voxel and static-threshold assumptions of standard 3D detectors and trackers. We introduce A3PRL, an RL-driven adaptive perception framework that closes the loop between LiDAR sensing and tracking. A3PRL builds on a sparsity-aware proposal stage with Temporal Dispersion Signatures and velocity-change cues, and deploys a lightweight 5D policy that jointly adjusts voxel resolution, detection sensitivity, and association gating based purely on label-free statistics summarizing spatio-temporal sparsity, foreground acceptance, and tracking continuity. The policy is trained with privileged supervision from ground-truth trajectories, which shapes a reward balancing geometric accuracy, temporal stability, and regularized acceptance, but it runs fully label-free at test time. On the public MMAUD benchmark, with training on V1 and evaluation on the unseen V2/V3 domains, A3PRL reduces 3D localization error by about 19\% relative to its non-RL counterpart and consistently outperforms LiDAR-only and multimodal baselines under both day and night conditions. We further show that the same policy transfers to an in-house LiDAR–RTK setup and a public multi-LiDAR SAT dataset with heterogeneous scan patterns, where it maintains accurate trajectories and stable tracks under varying sparsity while adding less than 2 ms per frame on a 10 Hz LiDAR budget.
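To make the closed sensing–tracking loop concrete, the sketch below shows one plausible shape of the test-time adaptation step: a small policy reads a 5D vector of label-free statistics and nudges voxel size, detection threshold, and association gate each frame. All names, the observation layout, and the multiplicative update rules are illustrative assumptions for exposition, not the authors' implementation or action parameterization.

\begin{verbatim}
# Minimal sketch, assuming a linear 5D-observation / 3-action policy.
# Statistic names, dimensions, and update rules are hypothetical.
import numpy as np

class AdaptivePerceptionPolicy:
    """Maps label-free statistics to bounded perception-parameter nudges."""
    def __init__(self, weights, bias):
        self.weights = weights  # assumed shape (3, 5): 3 actions, 5D obs
        self.bias = bias        # assumed shape (3,)

    def act(self, obs):
        # obs: 5D label-free summary (assumed layout):
        # [spatial sparsity, temporal sparsity, foreground acceptance,
        #  track continuity, velocity-change magnitude]
        return np.tanh(self.weights @ obs + self.bias)  # actions in [-1, 1]

def run_frame(policy, stats, params):
    """One perception step: read statistics, adjust parameters."""
    obs = np.array([stats["spatial_sparsity"],
                    stats["temporal_sparsity"],
                    stats["accept_rate"],
                    stats["track_continuity"],
                    stats["vel_change"]])
    delta = policy.act(obs)
    # Multiplicative updates keep sizes positive (illustrative choice).
    params["voxel_size"] *= np.exp(0.1 * delta[0])
    params["det_threshold"] = np.clip(
        params["det_threshold"] + 0.05 * delta[1], 0.05, 0.95)
    params["gate_radius"] *= np.exp(0.1 * delta[2])
    return params

# Example: one frame with placeholder weights and statistics.
policy = AdaptivePerceptionPolicy(np.zeros((3, 5)), np.zeros(3))
params = {"voxel_size": 0.2, "det_threshold": 0.5, "gate_radius": 1.5}
stats = {"spatial_sparsity": 0.8, "temporal_sparsity": 0.6,
         "accept_rate": 0.3, "track_continuity": 0.9, "vel_change": 0.1}
params = run_frame(policy, stats, params)
\end{verbatim}

Because the observations are computable from raw detector and tracker statistics alone, a loop of this shape needs no labels at test time; privileged ground-truth trajectories would enter only through the training-time reward.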