Multi-Modal (16)

[2025-2] 박제우 - AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection
https://arxiv.org/abs/2310.18961
Zero-shot anomaly detection (ZSAD) requires detection models trained using auxiliary data to detect anomalies without any training sample in a target dataset. It is a crucial task when training data is not accessible due to various concerns, e.g., data priva...
0. Abstract: Zero-shot anomaly detection (ZS..
2025. 9. 27.

[2025-2] 백승우 - Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
Recent advancements in Large Language Models (LLMs) and Vision-Language Models (VLMs) have sparked significant interest in developing GUI visual agents. We introduce MONDAY (Mobile OS Navigation Task Dataset for Agents from YouTube), a large-scale dataset...
2025. 8. 20.

[2025-2] 백승우 - UI-TARS: Pioneering Automated GUI Interaction with Native Agents
This paper introduces UI-TARS, a native GUI agent model that solely perceives the screenshots as input and performs human-like interactions (e.g., keyboard and mouse operations). Unlike prevailing agent frameworks that depend on heavily wrapped commercial...
2025. 7. 30.

[2025-1] 백승우 - GUI Agent by Script-based Automation
2025. 7. 4.