  • We provide a Playground for experiencing the world beyond the desk, and present a new lifestyle for moving from passive learning to a life of creation.

NLP (111)

[2026-1] 염제원, 김학선 - AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models. Existing language model evaluations primarily measure general capabilities, yet reliable use of these models across a range of domains demands factual accuracy and recognition of knowledge gaps. We introduce AA-Omniscience, a benchmark designed to measure.. (arxiv.org) ArtificialAnalysis/AA-Omniscience-Public · Data.. 2026. 2. 16.
[2026-1] 정재훈 - An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. https://arxiv.org/abs/1803.01271 For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a.. (arxiv.org) 0.BE.. 2026. 2. 7.
[2026-1] 백승우 - Self-Improving Pretraining: using post-trained models to pretrain better models. Ensuring safety, factuality and overall quality in the generations of large language models is a critical challenge, especially as these models are increasingly deployed in real-world applications. The prevailing approach to addressing these issues involve.. (arxiv.org) 2026. 2. 4.
[2026-1] 백승우 - UICOMPASS: UI Map Guided Mobile Task Automation via Adaptive Action Generation. Yuanzhang Lin, Zhe Zhang, He Rui, Qingao Dong, Mingyi Zhou, Jing Zhang, Xiang Gao, Hailong Sun. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025. (aclanthology.org) 2026. 1. 28.