  • We provide a Playground where you can experience the world beyond the desk, and propose a new lifestyle for the transition from passive learning to a life of creation.

Natural Language Processing (71)

[2025-2] 백승우 - Theory of Mind
References:
- Machine Theory of Mind, Neil C. Rabinowitz et al. (2018), https://arxiv.org/abs/1802.07740
- Theory of Mind May Have Spontaneously Emerged in Large Language Models, Michal Kosinski (2023), https://arxiv.org/abs/2302.02083
- Theory of Mind for Multi-Agent Collaboration via Large Language Models, Huao Li, Yu Quan Chong, Simon Stepputtis et al. (2024), https://arxiv.org/abs/2310.10701
- Theory of Mind in Large..
2025. 8. 7.
[2025-2] 백승우 - ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
While reasoning models (e.g., DeepSeek R1) trained with reinforcement learning (RL) excel in textual reasoning, they struggle in scenarios requiring structured problem-solving, such as geometric reasoning, precise computation, or complex equation solving..
2025. 7. 29.
[2025-2] 박지원 - QLoRA
QLoRA: Efficient Finetuning of Quantized LLMs, https://arxiv.org/abs/2305.14314
We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLoRA backpropagates gradients through a frozen, 4-bit quan..
1. Introduction: the challenge of fine-tuning large language models (LLMs) and the emergence of QLoRA..
2025. 7. 17.
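The setup described in the excerpt (a frozen 4-bit base model with trainable adapters, gradients computed in 16-bit) can be expressed in a few lines. Below is a minimal sketch assuming the Hugging Face transformers, peft, and bitsandbytes libraries; the model name and LoRA hyperparameters are illustrative placeholders, not the paper's exact configuration.

```python
# Minimal QLoRA-style finetuning setup (sketch, not the authors' code).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # keep the frozen base weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization from the paper
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # forward/backward computation in 16-bit
)

# Model name is an assumption for illustration only.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb_config
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # which projections get adapters is a choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the LoRA adapters remain trainable
model.print_trainable_parameters()
```

Training then proceeds as usual (e.g., with a standard Trainer loop); gradients flow through the frozen quantized weights into the small 16-bit adapter matrices, which is what keeps the memory footprint within a single GPU.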
[2025-2] 박제우 - Graph Attention Networks
Graph Attention Networks, https://arxiv.org/abs/1710.10903
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations..
1. Introduction: to date, CNNs have performed strongly on grid-like structures, as in image classification and machine ..
2025. 7. 13.
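The masked self-attention mentioned in the abstract reduces to a softmax over learned pairwise scores restricted to each node's neighborhood. The following single-head layer is a minimal PyTorch sketch written from the paper's formulation (not the authors' reference code), using a dense adjacency matrix for simplicity.

```python
# Single-head GAT layer sketch: e_ij = LeakyReLU(a^T [W h_i || W h_j]),
# alpha_ij = softmax_j(e_ij) over neighbors, h_i' = sum_j alpha_ij * W h_j.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)      # shared weight matrix W
        self.a = nn.Parameter(torch.empty(2 * out_dim, 1))   # attention vector a
        nn.init.xavier_uniform_(self.a)

    def forward(self, h, adj):
        # h: [N, in_dim] node features; adj: [N, N] adjacency with self-loops (1 = edge)
        Wh = self.W(h)                                        # [N, out_dim]
        d = Wh.size(1)
        # Split a^T [Wh_i || Wh_j] into a_src^T Wh_i + a_dst^T Wh_j and broadcast
        e_src = Wh @ self.a[:d]                               # [N, 1]
        e_dst = Wh @ self.a[d:]                               # [N, 1]
        e = F.leaky_relu(e_src + e_dst.t(), negative_slope=0.2)   # [N, N] raw scores
        e = e.masked_fill(adj == 0, float("-inf"))            # mask: attend to neighbors only
        alpha = torch.softmax(e, dim=1)                       # normalize per neighborhood
        return alpha @ Wh                                      # aggregated node features
```

Multi-head attention, as used in the paper, would run several such layers in parallel and concatenate (or average, at the output layer) their results.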