Poster
AutoPresent: Designing Structured Visuals from Scratch
Jiaxin Ge · Zora Zhiruo Wang · Xuhui Zhou · Yi-Hao Peng · Sanjay Subramanian · Qinyue Tan · Maarten Sap · Alane Suhr · Daniel Fried · Graham Neubig · Trevor Darrell
ExHall D Poster #262
Designing structured visuals such as presentation slides is essential for communicative needs, requiring both content creation and visual planning skills to deliver insights. In this work, we study automated slide generation, where models produce slide presentations from natural language (NL) instructions of varying specificity. We first introduce our SlidesBench benchmark, with 585 examples derived from slide decks across 10 domains. SlidesBench supports evaluations that are both (i) reference-based, measuring similarity to a target slide, and (ii) reference-free, measuring the design quality of generated slides alone. We benchmark end-to-end image generation and program generation methods with varied models, and find that programmatic methods produce higher-quality slides in user-interactable formats. Building on the success of program generation, we create Presenter, an 8B Llama-based slide generation model trained on 7k (NL, slide code) pairs, which achieves results comparable to the strongest model, GPT-4o. Beyond one-pass slide generation, we explore iterative design refinement with model self-critiques, which effectively improves element layout in slides. We hope our work can provide a basis for future research on generating structured visuals.
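To make the (NL, slide code) pair concept concrete, the sketch below shows what one hypothetical training example for a program-generation approach might look like: an NL instruction paired with a short python-pptx program that renders the slide. The field names, prompt template, and the specific API calls in the target string are illustrative assumptions, not the paper's actual data format.

```python
# Hypothetical (NL instruction, slide code) pair in the style of
# programmatic slide generation. The python-pptx calls inside the
# target string are illustrative; the data format is an assumption.
example_pair = {
    "instruction": (
        "Create a title slide that says 'Quarterly Review' "
        "with the subtitle '2024 Results'."
    ),
    "code": (
        "from pptx import Presentation\n"
        "prs = Presentation()\n"
        "slide = prs.slides.add_slide(prs.slide_layouts[0])\n"
        "slide.shapes.title.text = 'Quarterly Review'\n"
        "slide.placeholders[1].text = '2024 Results'\n"
        "prs.save('out.pptx')\n"
    ),
}

def render_example(pair: dict) -> str:
    """Format a pair as a supervised fine-tuning example:
    NL instruction as input, slide-generation program as target."""
    return f"### Instruction:\n{pair['instruction']}\n### Code:\n{pair['code']}"

print(render_example(example_pair))
```

Executing the generated program (rather than rendering pixels directly) is what yields slides in a user-interactable format such as .pptx.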