

Prompt Augmentation for Self-supervised Text-guided Image Manipulation

Rumeysa Bodur · Binod Bhattarai · Tae-Kyun Kim

Arch 4A-E Poster #391
Wed 19 Jun 5 p.m. PDT — 6:30 p.m. PDT


Text-guided image editing finds applications in various creative and practical fields. While recent strides in image generation have advanced the field, existing methods often grapple with the dual challenges of coherent image transformation and context preservation. In response, our work introduces prompt augmentation, a method that amplifies a single input prompt into a palette of target prompts, strengthening textual context and enabling localised image editing. Specifically, we utilise the augmented prompts to delineate the intended manipulation area. We propose a Contrastive Loss tailored to drive effective image editing by displacing edited areas and drawing preserved regions closer. Acknowledging the continuous nature of image manipulations, we further refine our approach by incorporating the similarity concept, creating a Soft Contrastive Loss. Both losses are incorporated into the diffusion model and trained end-to-end, yielding improved image editing results over the baseline on public datasets and generated images, and competitive results against state-of-the-art approaches.
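The abstract's Soft Contrastive Loss can be illustrated with a minimal sketch: preserved regions are pulled toward their source features, while edited regions are pushed away up to a hinge margin, with a soft per-region weight replacing the hard edited/preserved split. The function names, the cosine-distance choice, and the margin value below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two feature vectors (assumed layout)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def soft_contrastive_loss(src_feats, tgt_feats, edit_mask, margin=1.0, weights=None):
    """Hypothetical sketch of a soft contrastive objective.

    src_feats, tgt_feats: (N, D) per-region features of source and edited image.
    edit_mask: (N,) booleans, True where the region is meant to change.
    weights: optional (N,) soft preservation weights in [0, 1]
             (1 = fully preserved); defaults to hard 0/1 from edit_mask.
    """
    if weights is None:
        weights = (~edit_mask).astype(float)  # hard contrastive case
    loss = 0.0
    for s, t, w in zip(src_feats, tgt_feats, weights):
        d = 1.0 - cosine_sim(s, t)                 # cosine distance
        loss += w * d                              # attract preserved regions
        loss += (1.0 - w) * max(0.0, margin - d)   # repel edited regions (hinge)
    return loss / len(src_feats)
```

With identical source and target features and no edited regions, the attraction term is zero and the loss vanishes; marking a region as edited while its features remain unchanged incurs the full margin penalty, which is the repulsive pressure the loss is meant to exert.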
