All posts (271)

[2025-1] 백승우 - Data Selection for Language Models via Importance Resampling
Link preview (arxiv.org): Data Selection for Language Models via Importance Resampling. Selecting a suitable pretraining dataset is crucial for both general-domain (e.g., GPT-3) and domain-specific (e.g., Codex) language models (LMs). We formalize this problem as selecting a subset of a large raw unlabeled dataset to match a desired target di..
1. Method. DSIR Framework. From a large raw dataset, matching the distribution of the target data..
2025. 3. 3.

[2025-1] 임재열 - Playing Atari with Deep Reinforcement Learning
This paper was published by Google DeepMind in 2013 and is regarded as the paper that marked the beginning of deep reinforcement learning.
[Playing Atari with DRL] https://arxiv.org/abs/1312.5602
Link preview: Playing Atari with Deep Reinforcement Learning. We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-..
2025. 3. 1.

[2025-1] 이재호 - Deep Reinforcement Learning with Double Q-learning
https://arxiv.org/abs/1509.06461
Hado van Hasselt, Arthur Guez, David Silver - Google DeepMind
Link preview: Deep Reinforcement Learning with Double Q-learning. The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be prevent..
2025. 2. 28.

[2025-1] 김유현 - Wasserstein GAN
https://arxiv.org/abs/1701.07875
Link preview (arxiv.org): Wasserstein GAN. We introduce a new algorithm named WGAN, an alternative to traditional GAN training. In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debuggi..
1. Introduction. The goal of a GAN is to directly learn the distribution P(x) of the data x. Using the parameter θ, P(x) ..
2025. 2. 28.