OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters for Diffusion Models
Ali Aliev ⋅ Kamil Garifullin ⋅ Nikolay Yudin ⋅ Vera Soboleva ⋅ Alexander Molozhavenko ⋅ Ivan Oseledets ⋅ Aibek Alanov ⋅ Maxim Rakhuba
Abstract
In the rapidly growing field of model training, there is constant practical interest in parameter-efficient fine-tuning and in techniques that adapt a model to a narrow task from a small amount of training data. Despite the efficiency of LoRA, currently one of the most popular fine-tuning methods, an open question remains: how can several adapters, each tuned for a different task, be combined into a single adapter that performs adequately on all of them? In particular, merging subject and style adapters for generative models remains unresolved. In this paper we show that, in the case of orthogonal fine-tuning (OFT), a structured orthogonal parametrization together with manifold theory yields closed-form expressions for training-free adapter merging. Specifically, we derive the structure of the manifold formed by $\mathcal{GS}$ orthogonal matrices and obtain efficient formulas approximating geodesics between two points on it. We find that naive geodesic merging compresses the spectral distribution of the merged adapter, reducing its expressiveness, and propose a Cayley-transform correction that restores the spectral properties and yields higher-quality fusion. Experiments on subject-driven generation show that our technique for merging two $\mathcal{GS}$ orthogonal matrices successfully unites the concept and style features of different adapters. To our knowledge, this is the first training-free method for merging multiplicative orthogonal adapters.
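To make the merging recipe concrete, here is a minimal NumPy/SciPy sketch (our illustration, not the authors' code): it interpolates along the geodesic of the special orthogonal group via the matrix logarithm and exponential, and includes the classical Cayley map between skew-symmetric and orthogonal matrices. The helper names `geodesic_merge`, `cayley`, and `random_rotation` are hypothetical, and the paper's $\mathcal{GS}$-structured parametrization and spectral correction are not reproduced here.

```python
import numpy as np
from scipy.linalg import expm, logm

def geodesic_merge(Q1, Q2, t=0.5):
    """Point Q(t) = Q1 expm(t * logm(Q1^T Q2)) on the SO(n) geodesic.

    t = 0.5 gives the geodesic midpoint, a natural training-free
    merge of two orthogonal adapters Q1 and Q2.
    """
    A = np.real(logm(Q1.T @ Q2))   # relative rotation in the Lie algebra
    A = 0.5 * (A - A.T)            # project out numerical non-skewness
    return Q1 @ expm(t * A)

def cayley(A):
    """Cayley map: skew-symmetric A -> orthogonal (I - A)(I + A)^{-1}."""
    I = np.eye(A.shape[0])
    # (I - A) and (I + A) commute, so this equals (I - A)(I + A)^{-1}.
    return np.linalg.solve(I + A, I - A)

def random_rotation(n, rng):
    """Random element of SO(n) for the toy demo below."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1              # flip one column to fix det = +1
    return Q

# Toy usage: the merge of two orthogonal adapters is again orthogonal.
rng = np.random.default_rng(0)
Q1, Q2 = random_rotation(8, rng), random_rotation(8, rng)
Q = geodesic_merge(Q1, Q2)
assert np.allclose(Q.T @ Q, np.eye(8), atol=1e-8)
```

In Cayley coordinates the skew-symmetric parameter can be rescaled before mapping back, which is one plausible way to counteract the spectral compression mentioned in the abstract; the paper's actual correction may differ.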