[2025-1] 이재호 - Diffusion Model Alignment Using Direct Preference Optimization
https://arxiv.org/abs/2311.12908 - Bram Wallace et al., CVPR 2024
# Abstract
Problem: LLMs are aligned to human preferences with RLHF, but learning from human preferences has not yet been widely applied to diffusion models.
Existing approach: Text-to-image diffusion models have typically been fine-tuned on high-quality images and captions.
Proposed method: The paper proposes a new method, Diffusion-DPO. It adapts **Direct Preference Optimization (DPO)** to diffusion models and learns directly from pairs of images labeled with human preferences.
1. Introduction
Background: Text-to-image diff..
2025. 5. 31.

[2025-1] 황징아이 - Convolutional Character Networks
Paper: https://arxiv.org/abs/1910.07954
1. Introduction
Existing text reading models work in two stages: text detection (Text Detect..
2025. 5. 31.

[2025-1] 임재열 - Self-guided Knowledge-injected Graph Neural Network for Alzheimer’s Diseases
Self-guided Knowledge-injected Graph Neural Network for Alzheimer’s Diseases is a paper presented at MICCAI 2024 that (1) injects medical knowledge into a graph over brain regions for Alzheimer's diagnosis and (2) proposes a self-guided attention GNN that reflects each individual's connectivity characteristics.
[SGK-GNN] https://papers.miccai.org/miccai-2024/678-Paper0869.html ..
2025. 5. 31.

[2025-1] 박제우 - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
https://arxiv.org/abs/2010.11929
As for this paper, the previously reviewed CLIP and FLIP pap..
2025. 5. 30.