Agentic Reward Modeling: Verifying GUI Agent via Online Proactive Interaction
Reinforcement learning with verifiable rewards (RLVR) is pivotal for the continuous evolution of GUI agents, yet existing evaluation paradigms face significant limitations. Rule-based methods suffer from poor scalability and cannot handle open-ended tasks,
arxiv.org