Hist2Style: Histogram-Guided Stylization with Bilateral Grids
Abstract
Photorealistic style transfer aims to match the color and tone of an input image to those of a style target while preserving the content and details of the original scene. Although existing large image models can facilitate such appearance edits, their high computational demands, potential for hallucinations, and limited user control make them unsuitable for high-resolution, real-time workflows. We introduce Hist2Style, a bilateral-grid formulation for fast, edge-aware stylization that preserves visual fidelity by constraining operations to locally affine transforms in bilateral space. Our model is trained to reproduce the spatially varying color edits available in larger image editing models. This training paradigm involves generating a large supervised corpus with language and vision-language models and distilling a high-capacity editor into a lightweight model. The model conditions on a histogram-based embedding of the style target, which provides an interpretable interface for adjusting the output style by modifying the target color distribution. Overall, Hist2Style maintains content structure by construction, avoids hallucinations, and supports real-time, high-resolution photorealistic stylization with interactive user-controllable color and tone adjustments.
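To make the bilateral-grid idea concrete, the sketch below applies a grid of per-cell affine color transforms to an image: each pixel is mapped to a grid cell by its spatial position and its luma (the edge-aware guidance), and the cell's affine matrix is applied to the pixel's RGB. This is a minimal illustration under assumed conventions (nearest-neighbor slicing, a 3x4 [A | b] matrix per cell), not the paper's actual implementation, which would also include trilinear slicing and a learned grid predictor.

```python
import numpy as np

def slice_and_apply(image, grid):
    """Apply a bilateral grid of local affine color transforms.

    image: (H, W, 3) float RGB in [0, 1].
    grid:  (GH, GW, GD, 3, 4) affine matrices; the last two axes hold
           [A | b], so out = A @ rgb + b. Cells are indexed by spatial
           position (GH, GW) and by luma (GD), so pixels on opposite
           sides of an intensity edge fall into different depth cells,
           which is what makes the transform edge-aware.
    """
    H, W, _ = image.shape
    GH, GW, GD = grid.shape[:3]
    # Guidance signal: luma of the input pixel (assumed choice).
    luma = image @ np.array([0.299, 0.587, 0.114])
    # Nearest-neighbor slicing; real systems interpolate trilinearly.
    yi = np.clip(np.arange(H) * GH // H, 0, GH - 1)
    xi = np.clip(np.arange(W) * GW // W, 0, GW - 1)
    zi = np.clip((luma * GD).astype(int), 0, GD - 1)
    A = grid[yi[:, None], xi[None, :], zi]                    # (H, W, 3, 4)
    homo = np.concatenate([image, np.ones((H, W, 1))], -1)    # (H, W, 4)
    return np.einsum('hwij,hwj->hwi', A, homo)
```

Because the output is everywhere an affine function of the input colors, content structure is preserved by construction: the grid can shift color and tone per region but cannot invent new scene content.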