

Paper in Workshop: Workshop on Foundation and Large Vision Models in Remote Sensing

Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation

Thomas Kerdreux · Alexandre Tuel · Alexis Mouche · Quentin Febvre · Bertrand Chapron


Abstract:

Self-supervised learning (SSL) has enabled the development of vision foundation models for Earth Observation (EO), demonstrating strong transferability across diverse remote sensing tasks. While much research has focused on network architectures and training strategies, the role of dataset curation -- particularly in balancing and diversifying pre-training datasets -- remains underexplored. In EO, this challenge is exacerbated by the strong redundancy and heavy-tailed distributions of satellite imagery, which can lead to biased representations and inefficient training. In this work, we introduce a dynamic dataset pruning strategy designed to improve SSL pre-training efficiency by maximizing dataset diversity and balance. Our method iteratively refines the training set without relying on a pre-existing feature extractor, making it well suited to domains where curated datasets are unavailable. We illustrate our approach on the Sentinel-1 Wave Mode (WV) Synthetic Aperture Radar (SAR) archive, a challenging dataset composed primarily of ocean observations. We train models from scratch on the entire Sentinel-1 WV data archive, which spans 10 years. Our results, validated across three downstream tasks, show that dynamic pruning improves both computational efficiency and feature quality, leading to better transferability in real-world applications. This work provides a scalable and adaptable solution for dataset curation in EO, paving the way for more efficient and generalizable foundation models in remote sensing. We release the weights of Nereus-SAR-1, the first foundation model in our Nereus model family, a series of models dedicated to ocean observation and analysis using SAR imagery, at github.com/galeio-research/nereus-sar-models/.
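
The abstract does not spell out the pruning algorithm, so the following is only a generic sketch of one way diversity- and balance-driven pruning can work, not the authors' method. It assumes that each sample has an embedding (in a dynamic setting, taken from the model currently being trained rather than from a pre-existing feature extractor), clusters the embeddings with a plain k-means, and caps the number of samples kept per cluster. The function names `kmeans_labels` and `prune_balanced`, the parameters `k` and `cap`, and the toy data are all hypothetical.

```python
import numpy as np


def kmeans_labels(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def prune_balanced(embeddings, k=8, cap=50, seed=0):
    """Keep at most `cap` samples per cluster (hypothetical pruning rule)."""
    x = np.asarray(embeddings, dtype=float)
    labels = kmeans_labels(x, k, seed=seed)
    rng = np.random.default_rng(seed)
    kept = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if len(idx) > cap:
            # Over-represented cluster: subsample it down to the cap.
            idx = rng.choice(idx, size=cap, replace=False)
        kept.append(idx)
    return np.sort(np.concatenate(kept)), labels


# Toy heavy-tailed "archive": one dominant mode and two rare ones.
rng = np.random.default_rng(0)
emb = np.concatenate([
    rng.normal(0.0, 0.1, size=(900, 4)),
    rng.normal(3.0, 0.1, size=(50, 4)),
    rng.normal(-3.0, 0.1, size=(50, 4)),
])
kept, labels = prune_balanced(emb, k=3, cap=40)
print(len(emb), "->", len(kept), "samples kept")
```

Under these assumptions, a dynamic variant would recompute the embeddings and rerun `prune_balanced` periodically during pre-training, so that the retained subset follows the evolving representation. How the paper actually measures diversity and balance, and how often it refreshes the training set, is not stated in the abstract.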
