[2026-1] 백승우 - Agentic Reward Modeling: Verifying GUI Agent via Online Proactive Interaction

Reinforcement learning with verifiable rewards (RLVR) is pivotal for the continuous evolution of GUI agents, yet existing evaluation paradigms face significant limitations. Rule-based methods suffer from poor scalability and cannot handle open-ended tasks,

arxiv.org

