

Paper in Workshop: The 6th International Workshop and Prize Challenge on Agriculture-Vision: Challenges & Opportunities for Computer Vision in Agriculture, in conjunction with IEEE CVPR 2025

Agri-FM+: A Self-Supervised Foundation Model for Agricultural Vision

Md Jaber Al Nahian · Tapotosh Ghosh · Farnaz Sheikhi · Farhad Maleki


Abstract:

Foundation models have revolutionized computer vision, yet their adoption in precision agriculture remains limited due to significant domain shifts from natural images. Existing agricultural foundation models focus primarily on remote sensing; to date, no dedicated foundation model exists for close-field agricultural vision. In this paper, we propose Agri-FM+, a self-supervised foundation model tailored for agricultural vision and trained via a two-stage continual learning pipeline. Starting from publicly available self-supervised ImageNet weights released with SlotCon, Agri-FM+ is continually adapted on a curated 147K-image agricultural dataset using SlotCon. Evaluated across eight diverse benchmarks covering object detection, semantic segmentation, and instance segmentation, Agri-FM+ consistently outperforms both ImageNet-pretrained and randomly initialized models. Under full supervision, it achieves average gains of +1.27% over supervised ImageNet pretraining and +8.25% over random initialization. Even when trained with only 10% of the annotated data, Agri-FM+ maintains robust performance, with gains of +1.02% and +4.54% over supervised ImageNet pretraining and random initialization, respectively. These results demonstrate the ability of Agri-FM+ to provide domain-adapted, label-efficient representations that scale effectively across real-world agricultural vision tasks. The code, weights, and more details will be made available at: https://github.com/FarhadMaleki/AgriFMPlus.
