Natural Language Processing (63)

[2025-1] 백승우 - Data Selection for Language Models via Importance Resampling
Selecting a suitable pretraining dataset is crucial for both general-domain (e.g., GPT-3) and domain-specific (e.g., Codex) language models (LMs). We formalize this problem as selecting a subset of a large raw unlabeled dataset to match a desired target di… (arxiv.org)
1. Method — DSIR Framework: match the distribution of the target data within a large raw dataset …
2025. 3. 3.

[2025-1] 김지원 - Mamba: Linear-Time Sequence Modeling with Selective State Spaces (2023)
Citations: 2,256 (as of 2025-02-23)
Paper link: https://arxiv.org/pdf/2312.00752
Related post: https://blog.outta.ai/169 — [2025-1] 김지원 - Efficiently Modeling Long Sequences with Structured State Spaces (ICLR 2022 Outstanding Paper; 1,578 citations as of 2025-01-25; code: https://github.com/state-…)
2025. 2. 23.

[2025-1] 김학선 - Code Security Vulnerability Repair Using Reinforcement Learning with Large Language Models
https://arxiv.org/abs/2401.07031
With the recent advancement of Large Language Models (LLMs), generating functionally correct code has become less complicated for a wide array of developers. While using LLMs has sped up the functional development process, it poses a heavy risk to code sec… (arxiv.org)
Introduction …
2025. 2. 18.

[2025-1] 차승우 - Titans: Learning to Memorize at Test Time
https://arxiv.org/abs/2501.00663
Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called hidden state), attention allows attending to… (arxiv.org)
0. Abstract: Recurrent models compress data into a fixed-size memory (hidden state) …
2025. 2. 17.
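The DSIR entry above describes selecting a subset of a large raw dataset whose distribution matches a target set via importance resampling. A minimal sketch of that idea — my own illustration, not the paper's released code — using smoothed unigram log-likelihood ratios as importance weights and Gumbel top-k as sampling without replacement; `dsir_select` and its parameters are hypothetical names:

```python
import math
import random
from collections import Counter

def unigram_logprob(text, counts, total, vocab):
    # Add-one-smoothed unigram log-likelihood of `text` under the
    # distribution estimated by `counts` (with `total` tokens, `vocab` types).
    return sum(math.log((counts[w] + 1) / (total + vocab)) for w in text.split())

def dsir_select(raw, target, k, seed=0):
    """Select k examples from `raw` whose unigram statistics resemble
    `target`, by importance resampling (Gumbel top-k trick)."""
    rng = random.Random(seed)
    tgt = Counter(w for t in target for w in t.split())
    rawc = Counter(w for t in raw for w in t.split())
    vocab = len(set(tgt) | set(rawc))
    t_tot, r_tot = sum(tgt.values()), sum(rawc.values())
    scored = []
    for x in raw:
        # Log importance weight: log p_target(x) - log p_raw(x).
        w = unigram_logprob(x, tgt, t_tot, vocab) - unigram_logprob(x, rawc, r_tot, vocab)
        # Adding Gumbel noise and taking the top k is equivalent to
        # sampling k items without replacement, proportionally to exp(w).
        g = -math.log(-math.log(rng.random()))
        scored.append((w + g, x))
    return [x for _, x in sorted(scored, reverse=True)[:k]]
```

Real implementations score examples with hashed n-gram features over web-scale corpora; the unigram model here is only the smallest distribution estimate that makes the weight ratio concrete.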