Multi-Modal
[2025-2] 백승우 - UI-TARS: Pioneering Automated GUI Interaction with Native Agents
BaekDaBang 2025. 7. 30. 13:29

UI-TARS: Pioneering Automated GUI Interaction with Native Agents (arxiv.org)
This paper introduces UI-TARS, a native GUI agent model that solely perceives the screenshots as input and performs human-like interactions (e.g., keyboard and mouse operations). Unlike prevailing agent frameworks that depend on heavily wrapped commercial ...