Poster
STEPS: Sequential Probability Tensor Estimation for Text-to-Image Hard Prompt Search
Yuning Qiu · Andong Wang · Chao Li · Haonan Huang · Guoxu Zhou · Qibin Zhao
Recent text-to-image (T2I) diffusion models have demonstrated remarkable capabilities in visual synthesis, yet their performance relies heavily on the quality of the input prompts. Optimizing discrete prompts, however, remains challenging: the discrete nature of tokens prevents the direct application of gradient-based methods, and the search space of possible token combinations is vast. As a result, existing approaches either suffer from quantization errors when employing continuous optimization techniques or become trapped in local optima due to coordinate-wise greedy search. In this paper, we propose STEPS, a novel Sequential probability Tensor Estimation approach for hard Prompt Search. Our method reformulates discrete prompt optimization as a sequential probability tensor estimation problem, leveraging the tensor's inherent low-rank structure to address the curse of dimensionality. To further improve computational efficiency, we develop a memory-bounded sampling approach that keeps the sequential probability estimate compact, independent of the iteration step, while preserving the sequential optimization dynamics. Extensive experiments on various public datasets demonstrate that our method consistently outperforms existing approaches in T2I generation, cross-model prompt transferability, and harmful prompt optimization, validating the effectiveness of the proposed framework.
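Since no code accompanies the abstract, the following is only an illustrative sketch of the general idea it describes: representing a sequential probability distribution over token sequences in low-rank (here, tensor-train-style) form and drawing candidate prompts from it by ancestral sampling. It is not the authors' STEPS implementation; the prompt length, vocabulary size, rank, and all variable names are hypothetical, and the score-driven re-estimation of the distribution that STEPS performs is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: prompt length L, vocabulary size V, tensor-train rank R.
L, V, R = 6, 50, 4

# Non-negative tensor-train cores G[t] of shape (r_{t-1}, V, r_t) with
# boundary ranks r_0 = r_L = 1, so the joint distribution over token
# sequences p(x_1, ..., x_L) ∝ G[0][:, x_1, :] @ ... @ G[L-1][:, x_L, :]
# is stored with O(L * V * R^2) parameters instead of V^L entries.
ranks = [1] + [R] * (L - 1) + [1]
cores = [rng.random((ranks[t], V, ranks[t + 1])) for t in range(L)]

# Right environments: each remaining core summed over its token index,
# so conditionals p(x_t | x_{<t}) can be evaluated without enumerating
# all V**L sequences (the curse of dimensionality the abstract mentions).
right = [None] * (L + 1)
right[L] = np.ones(1)
for t in range(L - 1, -1, -1):
    right[t] = cores[t].sum(axis=1) @ right[t + 1]

def sample_prompt():
    """Draw one token sequence by sequential (ancestral) sampling."""
    prefix = np.ones(1)          # running contraction over sampled tokens
    tokens = []
    for t in range(L):
        # Unnormalized conditional over the V candidate tokens at slot t.
        scores = np.einsum("i,ivj,j->v", prefix, cores[t], right[t + 1])
        probs = scores / scores.sum()
        x = int(rng.choice(V, p=probs))
        tokens.append(x)
        prefix = prefix @ cores[t][:, x, :]
    return tokens

print(sample_prompt())
```

In a full search loop, the sampled token sequences would be scored (e.g., by how well the generated images match a target) and the low-rank cores re-estimated from the best candidates; that estimation step is the contribution of the paper and is not reproduced in this toy sketch.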