OrionEdit: Bridging Reference and Source Images for Generalized Cross-Image Editing
Abstract
Multimodal image synthesis has achieved remarkable progress in producing visually coherent results, yet most editing methods still rely on semantic instructions, which are less direct than visual guidance. Recently, a new paradigm has emerged that focuses on "editing one image from another," enabling more direct and interpretable manipulation through reference exemplars. In this work, we formalize this paradigm as cross-image editing: modifying a source image under the guidance of one or more references, encompassing subject replacement, style transfer, image completion, and other reference-to-source tasks. To address this setting, we introduce OrionEdit, a unified framework that regulates visual attribute transfer through two key mechanisms: (1) a symmetric orthogonal subspace update that partitions image features into branch-specific subspaces, mitigating feature entanglement and preserving subject identity; and (2) a reverse-causal attention mechanism with an information-flow mask that enforces unidirectional dependencies in the latent space. Built on standard diffusion backbones, OrionEdit enables zero-shot editing with multiple references and yields consistent gains over open-source baselines, rivaling proprietary models in fidelity and disentanglement.
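To make the information-flow mask concrete, the sketch below shows one plausible construction of a block-structured attention mask in which source tokens may attend to reference tokens but not vice versa, so information flows only from reference to source. The function name, token ordering, and boolean convention are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def information_flow_mask(n_ref: int, n_src: int) -> np.ndarray:
    """Hypothetical sketch of a unidirectional information-flow mask.

    Reference tokens are assumed to come first, source tokens after.
    Source tokens may attend to reference tokens (reference -> source
    flow), but reference tokens never attend to source tokens, keeping
    the reference branch uncontaminated by the edit.
    """
    n = n_ref + n_src
    mask = np.zeros((n, n), dtype=bool)  # True = attention allowed
    mask[:n_ref, :n_ref] = True   # reference self-attention
    mask[n_ref:, n_ref:] = True   # source self-attention
    mask[n_ref:, :n_ref] = True   # source attends to reference
    # mask[:n_ref, n_ref:] stays False: reference never sees source
    return mask

m = information_flow_mask(2, 3)
```

In a diffusion transformer, such a mask would typically be passed to each attention layer so the blocked entries receive a large negative bias before the softmax; the exact integration point depends on the backbone.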