OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models
Abstract
The deployment of autonomous agents in Graphical User Interface (GUI) environments faces significant challenges, notably error accumulation in long-horizon tasks and the severe consequences of irreversible operations. While critic models that provide real-time action assessment offer a promising solution, their effectiveness is hindered by the scarcity of diverse, high-quality GUI feedback data and the absence of public critic benchmarks for computer use. To bridge these gaps, we introduce OS-Oracle, which makes three core contributions: (1) a scalable data pipeline for synthesizing cross-platform GUI critic data; (2) a two-stage training paradigm combining supervised fine-tuning (SFT) with consistency-preserving group relative policy optimization (CP-GRPO); and (3) OS-Critic Bench, a holistic benchmark for evaluating critic models across Mobile, Web, and Desktop platforms. Leveraging this framework, we curate a high-quality dataset of 310k critic samples. The resulting critic model, OS-Oracle-7B, achieves strong performance and further reduces error rates, improving the capability of GUI agents in dynamic environments. All code, data, and checkpoints will be made public.