FlashIn: Fast and Accurate Image Inversion for Real-time Image Editing
Abstract
Given an image and a descriptive prompt, image inversion seeks to identify the initial noise that, when denoised, accurately reconstructs the original image. This is crucial for applications like image editing, which can be achieved by denoising the inverted noise with an edited prompt. Existing methods often rely on approximations and require many steps, leading to inaccuracies, slow processing, and artifacts caused by the inherent intractability of the inversion process. To overcome these issues, we propose FlashIn, a novel algorithm for faster and more accurate image inversion that enables high-quality, real-time editing. FlashIn offers two main contributions: i) a learnable neural network that directly maps an image to its corresponding noise; trained with a cycle-consistent strategy on generated data and their seed noise, it yields a more efficient and precise inversion model; and ii) adversarial training that aligns noise-reconstructed images with real ones, enhancing inversion accuracy and editing quality. Together, these strategies enable fast, accurate inversion in a single step, with further improvements possible through additional steps. Integrated with few-step diffusion models such as Flux.1-Schnell, our method achieves high-quality image editing within one second on a single A100 GPU, facilitating real-time, interactive editing. Extensive experiments demonstrate that FlashIn delivers state-of-the-art inversion precision and impressive editing results across various scenarios and applications.
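The cycle-consistent training idea in contribution (i) can be illustrated with a toy sketch: sample seed noise, run it through a frozen generator, and train an inversion model so that the predicted noise matches the original seed. This is a minimal, hypothetical illustration only; the generator here is a fixed random orthogonal map standing in for a frozen few-step diffusion model, and the inversion network is reduced to a single learnable linear map, neither of which is part of the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy latent dimension (hypothetical; real latents are far larger)

# Stand-in for the frozen few-step generator G: a fixed random orthogonal
# map. In FlashIn this would be a diffusion model such as Flux.1-Schnell.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))

def generate(Z):
    """'Denoise' seed noise columns Z into toy 'images' X."""
    return Q @ Z

# Learnable inversion model F, reduced here to a single linear map W.
W = rng.normal(scale=0.1, size=(d, d))

def cycle_loss(W, Z):
    # Cycle consistency: F(G(z)) should recover the seed noise z.
    return np.mean((W @ generate(Z) - Z) ** 2)

Z_eval = rng.normal(size=(d, 256))        # held-out seed noise for evaluation
loss_before = cycle_loss(W, Z_eval)

lr = 0.05
for _ in range(500):
    Z = rng.normal(size=(d, 64))          # fresh seed noise each step
    X = generate(Z)                        # generated training data
    grad = 2.0 * (W @ X - Z) @ X.T / Z.shape[1]
    W -= lr * grad                         # gradient step on the cycle loss

loss_after = cycle_loss(W, Z_eval)
print(f"cycle loss: {loss_before:.4f} -> {loss_after:.6f}")
```

After training, the cycle loss drops by orders of magnitude, i.e. the learned map approximately inverts the generator in one application, mirroring the single-step inversion the abstract describes. The adversarial alignment of contribution (ii) would add a discriminator term on top of this objective and is omitted here.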