MMVIP: A Visible-infrared Paired Dataset for Multi-weather Marine Vision
Abstract
Maritime multimodal vision is challenging because oceanic weather and environmental conditions are complex and highly variable. Modern vessels are commonly equipped with both visible and infrared imaging systems, but exploiting the complementary information in these modalities depends on accurate cross-modal registration. Progress in this field has been severely hindered by the absence of paired visible–infrared datasets that realistically capture diverse maritime scenarios. To overcome this limitation, we present MMVIP, the first large-scale visible–infrared maritime vision dataset covering a wide spectrum of weather conditions and sea states. The dataset contains 128,100 images and 50 video sequences with precise spatial–temporal alignment. Comprehensive evaluations on image registration, image fusion, maritime object detection, and cross-modal image translation demonstrate both the usefulness of the dataset and the challenges it poses. MMVIP thereby establishes a new benchmark for advancing multimodal maritime perception. The dataset and corresponding benchmarks are publicly available.