Temporal Equilibrium MeanFlow: Bridging the Scale Gap for One-Step Generation
Abstract
MeanFlow is a powerful few-step generative framework that can be trained from scratch, but its performance degrades significantly when the one-step loss is applied to a large fraction of the training samples. This degradation stems from a temporal scale imbalance: gradients from different stages of generation contribute unevenly, leading to unstable optimization that manifests as blurry samples and high FID scores. The core issue is a conflict between two opposing forces: loss terms that amplify variance over long time spans, and the strong constraints needed near the start of generation; a fixed sampling strategy cannot reconcile the two. To resolve this conflict, we propose Temporal Equilibrium MeanFlow (TEMF), which balances these competing demands through two simple yet effective components: (1) a temporal equilibrium weighting function that equalizes gradient influence across all time scales, and (2) a dynamic boundary scheduler that gradually shifts the training focus from stabilizing early steps to refining the full trajectory as training progresses. Without changing the model architecture, TEMF retains true one-step generation with classifier-free guidance and achieves a state-of-the-art FID of 2.62 on ImageNet 256×256, the best result among diffusion- and flow-based one-step methods.