All posts (354)

[2025-2] 백승우 - GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning
Link: openreview.net — "With the rapid development of Large Vision Language Models, the focus of Graphical User Interface (GUI) agent tasks shifts from single-screen tasks to complex screen navigation challenges. However..."
2025. 11. 26.

[2025-2] 최민서 - Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper link: https://arxiv.org/abs/2305.18290 — "While large-scale unsupervised language models (LMs) learn broad world knowledge and some reasoning skills, achieving precise control of their behavior is difficult due to the completely unsupervised nature of their training. Existing methods for gaining s..."
2025. 11. 19.

[2025-2] 이루가 - "Why Should I Trust You?": Explaining the Predictions of Any Classifier
Paper link: https://arxiv.org/abs/1602.04938 — "Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when ch..."
2025. 11. 8.

[2025-2] 정유림 - Quantifying Attention Flow in Transformers
Paper overview — Title: Quantifying Attention Flow in Transformers; Year: 2020 (arXiv:2005.00928); Citations: 1,331 as of 2025.11.08.
Background — is an attention visualization an explanation? Since self-attention quantifies how much each token attends to every other token, attention heatmaps have often been used as if they were explanations. But a Transformer contextualizes and mixes information as it passes through layers, and information bypasses and accumulates via residual connections and FFNs. As a result, raw attention in the higher layers often flattens toward uniform, and token contributions can no longer be read off directly...
2025. 11. 8.
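The last entry's preview explains why raw attention maps fail as explanations; the paper's proposed remedy is attention rollout, which propagates attention through the layers while accounting for residual connections. A minimal sketch of that computation (assuming per-layer attention matrices already averaged over heads; the function name is illustrative):

```python
import numpy as np

def attention_rollout(attentions):
    """Attention rollout: multiply per-layer attention matrices together,
    mixing each with the identity to model the residual connection.

    attentions: list of (tokens, tokens) matrices, head-averaged,
    rows summing to 1.
    """
    n = attentions[0].shape[0]
    rollout = np.eye(n)
    for A in attentions:
        # Residual connection: half the information skips attention,
        # so mix with the identity and renormalize rows.
        A_res = 0.5 * A + 0.5 * np.eye(n)
        A_res = A_res / A_res.sum(axis=-1, keepdims=True)
        rollout = A_res @ rollout
    return rollout

# Toy example: two layers of uniform attention over 3 tokens.
layers = [np.full((3, 3), 1.0 / 3) for _ in range(2)]
R = attention_rollout(layers)
# Each row of R is a proper distribution over input tokens.
```

Each row of the result is a distribution over input tokens that reflects accumulated, not single-layer, attention, which is what the preview argues raw heatmaps miss.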