OS-Fed: One Snapshot Is All You Need
Xuwei Qian ⋅ Jinghui Zhang ⋅ Yuchuan Tan ⋅ Wenbo Huang ⋅ Zhen Wu ⋅ Shen Zhou ⋅ LiSha Gao ⋅ Ding Ding ⋅ Fang Dong
Abstract
Reducing communication overhead in federated learning (FL) is challenging but crucial for large-scale, distributed, privacy-preserving machine learning. Unfortunately, directly compressing model updates often leads to sub-optimal convergence due to information loss, while increasing local computation can cause model divergence. This paper therefore proposes a drastically different approach that follows the maxim ``a picture is worth a thousand words''. We observe that the entire gradient information from local training can be effectively reconstructed from a compact, image-like representation. Based on this observation, we propose OS-Fed, which performs One-Shot Federated Learning by transmitting only a single, compact snapshot (comprising an image and a set of learnable labels) per round. To realize this approach, OS-Fed introduces new snapshot synthesis techniques that (1) target the accumulated update of a trajectory segment to tackle gradient noise, (2) use a multi-grid snapshot that decouples conflicting gradient directions, and (3) incorporate error compensation to maintain training stability under extreme compression. Extensive experiments on CV and NLP benchmarks show that OS-Fed reduces communication costs by 1.5-16$\times$ compared to state-of-the-art algorithms, resulting in 18-45\% faster convergence.
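The abstract does not specify how error compensation in item (3) is implemented. As a rough illustration of the general idea, the sketch below shows the standard error-feedback pattern for lossy update compression: whatever the compressor drops in one round is stored and added back before compressing in the next round. The top-k compressor is only a stand-in, assumed here for simplicity, and is not OS-Fed's snapshot synthesis; the names `top_k` and `ErrorCompensator` are hypothetical.

```python
def top_k(vec, k):
    """Stand-in lossy compressor (an assumption, not OS-Fed's snapshot):
    keep the k largest-magnitude entries of vec and zero the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out


class ErrorCompensator:
    """Generic error feedback: carry the compression residual into the next round."""

    def __init__(self, dim):
        self.residual = [0.0] * dim  # information dropped so far

    def compress(self, update, k):
        # Add back what previous rounds failed to transmit.
        corrected = [u + r for u, r in zip(update, self.residual)]
        sent = top_k(corrected, k)
        # Remember what this round's compression dropped.
        self.residual = [c - s for c, s in zip(corrected, sent)]
        return sent


comp = ErrorCompensator(dim=4)
sent1 = comp.compress([1.0, 0.25, -0.5, 0.125], k=1)
sent2 = comp.compress([0.0, 0.25, -0.5, 0.125], k=1)
print(sent1)          # [1.0, 0.0, 0.0, 0.0]
print(sent2)          # [0.0, 0.0, -1.0, 0.0]
print(comp.residual)  # [0.0, 0.5, 0.0, 0.25]
```

In this toy run the third coordinate is never large enough to be sent on its own in round one, but its residual accumulates and it is transmitted in round two. Across rounds, the sum of everything sent plus the current residual equals the sum of the raw updates, which is the property that keeps aggressive compression from silently discarding information.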