All posts (305)

[2025-1] 박서형 - PSGAN (Pedestrian-Synthesis-GAN: Generating Pedestrian Data in Real Scene and Beyond)
https://arxiv.org/abs/1804.02047
State-of-the-art pedestrian detection models have achieved great success in many benchmarks. However, these models require lots of annotation information and the labeling process usually takes much time and efforts. In this paper, we propose a method to ge.. (arxiv.org)
1. Introduction Pedest..
2025. 5. 17.

[2025-1] 임재열 - DRÆM – A discriminatively trained reconstruction embedding for surface anomaly detection
DRAEM, presented at ICCV 2021, proposes a new unsupervised model that learns anomaly detection from pairs of reconstructed and original images.
[DRAEM] https://arxiv.org/abs/2108.07610
Visual surface anomaly detection aims to detect local image regions that significantly deviate from normal appearance. Recent surface anomaly detection methods rely on ..
2025. 5. 17.

[2025-1] 유경석 - FlexiViT: One Model for All Patch Sizes
https://arxiv.org/pdf/2212.08013
https://github.com/google-research/big_vision
GitHub - google-research/big_vision: Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more. (github.com)
Abstract: The patch size of a ViT is a factor that determines speed and accuracy, but changing the patch size..
2025. 5. 17.

[2025-1] 박제우 - Scaling Language-Image Pre-training via Masking
https://arxiv.org/abs/2212.00794
We present Fast Language-Image Pre-training (FLIP), a simple and more efficient method for training CLIP. Our method randomly masks out and removes a large portion of image patches during training. Masking allows us to learn from more image-text pairs give.. (arxiv.org)
https://blog.outta.ai/284
This paper, the previously reviewed natural-language supervised learning mo..
2025. 5. 17.