VisiLock: Authorizing Instruction-Based Image Editing with Dual Score Distillation
Abstract
While open-sourcing instruction-guided image editing models accelerates research, it surrenders control over their capabilities to anyone who downloads the weights. Existing protection methods are reactive: they verify ownership after generation, but the underlying model remains fully functional for unauthorized users. We introduce \textbf{VisiLock}, which bakes access control into the model weights, rendering the model unusable without a visual trigger in the input. The challenge is to train a model that retains full editing capability for authorized inputs yet remains unusable for unauthorized ones, without destabilizing training. Naive multi-task objectives create gradient conflicts that collapse training, while contrastive approaches such as FMLock destroy the denoising manifold. We develop \textbf{Dual Score Distillation}, a dual-teacher framework in which a degraded teacher defines the locked behavior and the original teacher guides editing quality, eliminating gradient interference through separate frozen targets. A key risk is that released models could be unlocked through post-hoc fine-tuning. To prevent this, we initialize the student from the degraded teacher so that it begins in a locked state and regains editing ability only for authorized inputs via distillation; this impedes adversarial fine-tuning from recovering full editing capability. Evaluation on InstructPix2Pix shows that authorized edits maintain baseline quality (CLIP-I: 0.821, DINO: 0.726) while unauthorized attempts degrade substantially (CLIP-I: 0.481, DINO: 0.072), corresponding to 41\% and 90\% drops in image and semantic similarity. The lock remains robust to key corruptions, spatial perturbations, and adversarial unlock fine-tuning. Code and data will be released for research purposes.
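As a rough illustration of the dual-teacher objective described above, the following is a minimal PyTorch sketch of one distillation step. All names here (\texttt{student}, \texttt{teacher\_orig}, \texttt{teacher\_degraded}, \texttt{has\_trigger}) are hypothetical stand-ins, not the paper's actual API, and details such as noise-level weighting and trigger encoding are assumptions not specified in the abstract.

\begin{verbatim}
# Hypothetical sketch of a dual-teacher distillation step; not the
# paper's actual implementation. Assumes epsilon-prediction denoisers
# operating on (B, C, H, W) latents and a per-sample trigger mask.
import torch
import torch.nn.functional as F

def dual_score_distillation_step(student, teacher_orig, teacher_degraded,
                                 noisy_latents, timesteps, cond, has_trigger):
    """Authorized samples (trigger present) are pulled toward the frozen
    original teacher; unauthorized samples toward the frozen degraded
    teacher. Routing each sample to a separate frozen target avoids the
    gradient conflicts of a single multi-task objective."""
    # Student prediction; gradients flow only through this call.
    eps_student = student(noisy_latents, timesteps, cond)

    # Frozen teachers supply the two behavioral targets.
    with torch.no_grad():
        eps_orig = teacher_orig(noisy_latents, timesteps, cond)
        eps_deg = teacher_degraded(noisy_latents, timesteps, cond)

    # has_trigger: (B,) boolean; 1 = authorized input, 0 = unauthorized.
    mask = has_trigger.float().view(-1, 1, 1, 1)
    target = mask * eps_orig + (1.0 - mask) * eps_deg

    return F.mse_loss(eps_student, target)
\end{verbatim}

Initializing \texttt{student} from the weights of \texttt{teacher\_degraded}, as the abstract describes, means the released model starts locked and only the distillation signal on trigger-bearing inputs restores editing behavior.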