Multi-Modal (22)

[2025-2] 백승우 - MAS-Bench: A Unified Benchmark for Shortcut-Augmented Hybrid Mobile GUI Agents
To enhance the efficiency of GUI agents on various platforms like smartphones and computers, a hybrid paradigm that combines flexible GUI operations with efficient shortcuts (e.g., API, deep links) is emerging as a promising direction. However, a framework...
arxiv.org
2025. 12. 24.

[2025-2] 백승우 - Toward Autonomous UI Exploration: The UIExplorer Benchmark
https://arxiv.org/abs/2506.17779
2025. 12. 3.

[2025-2] 백승우 - GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning
With the rapid development of Large Vision Language Models, the focus of Graphical User Interface (GUI) agent tasks shifts from single-screen tasks to complex screen navigation challenges. However...
openreview.net
2025. 11. 26.

[2025-2] 백승우 - UltraCUA: A Foundation Model for Computer Use Agents with Hybrid Action
Multimodal agents for computer use rely exclusively on primitive actions (click, type, scroll) that require accurate visual grounding and lengthy execution chains, leading to cascading failures and performance bottlenecks. While other agents leverage rich...
arxiv.org
2025. 10. 29.