All posts (78)

[2023-2] 백승우 - RUBi: Reducing Unimodal Biases for Visual Question Answering
RUBi: Reducing Unimodal Biases in Visual Question Answering (arxiv.org): Visual Question Answering (VQA) is the task of answering questions about an image. Some VQA models often exploit unimodal biases to provide the correct answer without using the image information. As a result, they suffer from a huge drop in performance whe...
0. Abstract: Some VQA models exploit unimodal bias to arrive at the correct answer without using the image information..
2023. 11. 20.

[2023-2] 백승우 - Show and Tell: A Neural Image Caption Generator
Show and Tell: A Neural Image Caption Generator (arxiv.org): Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that com...
Abstract: Combines computer vision and machine translation to train a generative model based on a deep recurrent architecture. Given an image, the model is trained to maximize the likelihood of the target description sentence..
2023. 11. 5.