LiDAR-to-4DRadar Diffusion Bridge via Cross-Modal Alignment and Translation in Latent Space
Abstract
Millimeter-wave radar’s all-weather capability makes it increasingly vital for autonomous perception. However, the high cost of radar data collection drives the need for data generation to augment radar datasets. Existing works mainly target partial radar representations, e.g., 2D or 3D slices, leading to information loss and limited downstream performance. To overcome these issues, we introduce the novel task of LiDAR-to-4DRadar translation, which generates complete 4D radar tensors, with three spatial axes and one Doppler axis, guided by LiDAR data while preserving spatial and semantic consistency. To tackle this task, we propose \textbf{L2RLDB}, a diffusion bridge model operating in an aligned LiDAR-4DRadar latent space. Specifically, a key-voxel-aware VAE first compresses high-dimensional, noisy radar tensors into a compact latent space while enabling precise numerical reconstruction and key-voxel identification. Second, to bridge the cross-modal gap between sparse 3D LiDAR and dense 4D radar, we develop a patch-wise contrastive learning module that aligns LiDAR latents with radar latents both semantically and spatially. Finally, we formulate the translation as a diffusion bridge process between LiDAR and radar latents, enabling the synthesis of full radar tensors from LiDAR inputs that lack Doppler information. Experiments verify that L2RLDB achieves high-fidelity 4D radar generation and significantly improves downstream detection performance when used for data augmentation.
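For intuition, a diffusion bridge between two latents can be instantiated, for example, as a Brownian bridge pinned at both endpoints; the sketch below is an illustrative generic formulation, not necessarily the exact schedule or noise parameterization used by L2RLDB, and the symbols $z_{\mathrm{lidar}}$, $z_{\mathrm{radar}}$, $\sigma$, and $T$ are introduced here for illustration only:
\begin{equation}
z_t \;=\; \Bigl(1-\tfrac{t}{T}\Bigr) z_{\mathrm{lidar}} \;+\; \tfrac{t}{T}\, z_{\mathrm{radar}} \;+\; \sigma \sqrt{\tfrac{t\,(T-t)}{T}}\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I),
\end{equation}
so the process starts at the aligned LiDAR latent at $t=0$ and terminates at the radar latent at $t=T$; a denoising network trained on intermediate states $z_t$ can then translate from the LiDAR endpoint toward the radar endpoint at inference time.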