Computer Vision

[2024-1] 염제원 - Siamese Neural Networks for One-Shot Image Recognition

by Scuttie 2024. 5. 6.

https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf

1-1. Upsides of this approach

Learns generic image features that remain useful for making predictions about unknown class distributions, even when very few examples are available.
Trains easily with standard optimization techniques on pairs sampled from the source data.
Offers a competitive approach that relies on deep learning techniques rather than domain-specific knowledge.

1-2. Learning Strategy

Train a neural network to discriminate between the class identities of image pairs (the standard verification task).
The model outputs the probability that the two input images belong to the same class.
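The verification task needs labeled image pairs. Below is a minimal sketch of how such pairs could be drawn from a labeled dataset. The function name `sample_pair`, the 50/50 split between same-class and different-class pairs, and the toy string "images" are all assumptions for illustration, not details taken from the paper.

```python
import random

def sample_pair(images_by_class, rng):
    """Return (image_a, image_b, label); label is 1 for same class, 0 otherwise."""
    classes = list(images_by_class)
    if rng.random() < 0.5:
        # Same-class pair: two distinct examples of one class.
        cls = rng.choice(classes)
        image_a, image_b = rng.sample(images_by_class[cls], 2)
        return image_a, image_b, 1
    # Different-class pair: one example from each of two distinct classes.
    cls_a, cls_b = rng.sample(classes, 2)
    return rng.choice(images_by_class[cls_a]), rng.choice(images_by_class[cls_b]), 0

# Toy data: strings stand in for images; each class needs at least two examples.
toy_data = {"a": ["a1", "a2", "a3"], "b": ["b1", "b2"], "c": ["c1", "c2"]}
rng = random.Random(0)
pairs = [sample_pair(toy_data, rng) for _ in range(8)]
```

Each sampled triple can then be fed to the twin networks, with the label serving as the target for the same-class probability.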

1-3. Test Phase

Given one example image for each novel class and a set of test images, compute for each test image the probability that it belongs to the same class as each novel-class example.
Predict the class with the highest probability.
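The test phase can be sketched as an argmax over pairwise similarity scores. In the sketch below, `one_shot_predict`, `toy_similarity`, and the numeric "images" are hypothetical stand-ins; in practice the similarity function would be the trained Siamese network.

```python
def one_shot_predict(similarity, test_image, support_set):
    """Pick the class whose single example is most similar to test_image.

    support_set maps each novel class label to its one example image.
    similarity(a, b) returns a same-class score, higher meaning more alike.
    """
    scores = {
        label: similarity(test_image, example)
        for label, example in support_set.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores

def toy_similarity(a, b):
    # Stand-in for the trained network: closer numbers score higher.
    return 1.0 / (1.0 + abs(a - b))

support = {"cat": 1.0, "dog": 5.0, "bird": 9.0}
label, scores = one_shot_predict(toy_similarity, 4.2, support)
```

Here `label` is the class of the closest support example, which is how a K-way one-shot trial is scored.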

Visualization of the approach

2-1. Model Architecture

Siamese Network: Use two identical networks that share weights, and measure the distance between their embeddings to score the similarity of the two inputs.
Verification Stage: Train a ConvNet to output a useful embedding of each input image.
Classification Stage: Given a K-way 1-shot classification problem, assign the test image to the class whose example has the highest similarity.

Model Architecture

2-2. Prediction Vector

The prediction is the sigmoid of a weighted L1 distance between the two embeddings.

Prediction Vector and L1, L2 Distances
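As a rough illustration of the prediction step, the sketch below computes the sigmoid of a weighted L1 distance between two embedding vectors. The names `siamese_prediction`, `h1`, `h2`, and `alpha` are chosen here for illustration, and plain Python lists stand in for the network's final-layer features.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def siamese_prediction(h1, h2, alpha):
    """Same-class probability from two embeddings of equal length.

    alpha holds one learned weight per embedding dimension.
    """
    weighted_l1 = sum(a * abs(u - v) for a, u, v in zip(alpha, h1, h2))
    return sigmoid(weighted_l1)

# Hypothetical embeddings and weights, only to show the call.
p = siamese_prediction([0.2, 0.9, 0.4], [0.1, 0.3, 0.4], [-1.5, -2.0, -0.5])
```

In this form, identical embeddings give a score of exactly 0.5, so negative weights are what push dissimilar pairs toward 0.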

2-3. Loss Function

Binary cross-entropy (BCE) loss with a layer-wise L2 weight penalty

Loss Function
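A minimal sketch of a regularized cross-entropy objective for a single pair follows. It assumes the penalty is the squared weight norm, which is my reading of the objective in the paper, and it simplifies the layer-wise regularization weights to a single scalar `lam`. The function name and the `eps` clamp are illustrative additions.

```python
import math

def regularized_bce(p, y, weights, lam, eps=1e-12):
    """Binary cross-entropy for one pair plus a squared-weight penalty.

    p: predicted same-class probability, y: 1 if same class else 0.
    weights: flat list of model weights, lam: penalty strength (assumed scalar).
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    penalty = lam * sum(w * w for w in weights)
    return bce + penalty

# A confident correct prediction costs less than a confident wrong one.
loss_right = regularized_bce(0.9, 1, [0.5, -0.5], lam=0.01)
loss_wrong = regularized_bce(0.9, 0, [0.5, -0.5], lam=0.01)
```

Averaging this quantity over a minibatch of pairs gives the training objective in this simplified form.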

2-4. Optimization

Gradient Descent with Momentum and Regularizer
Momentum Learning Schedule
Weight Initialization with Normal Distribution
Hyperparameter Optimization
Affine Distortions for Data Augmentation
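The first two items can be sketched as a momentum update with an L2 weight penalty, plus simple per-epoch schedules. The linear momentum ramp, the geometric learning-rate decay, and every numeric value below are illustrative assumptions, not settings confirmed by this summary.

```python
def momentum_step(weights, grads, velocity, lr, momentum, l2):
    """One gradient-descent step with momentum and an L2 weight penalty."""
    new_velocity = [
        momentum * v - lr * (g + 2.0 * l2 * w)
        for w, g, v in zip(weights, grads, velocity)
    ]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity

def momentum_at(epoch, start, final, ramp_epochs):
    """Linearly ramp momentum from start to final over ramp_epochs epochs."""
    fraction = min(epoch / ramp_epochs, 1.0)
    return start + fraction * (final - start)

def learning_rate_at(epoch, initial_lr, decay=0.99):
    """Shrink the learning rate by a constant factor each epoch."""
    return initial_lr * decay ** epoch

# One update on a toy two-weight model with hypothetical settings.
w, v = [1.0, -2.0], [0.0, 0.0]
w, v = momentum_step(w, [0.5, -0.1], v,
                     lr=learning_rate_at(0, 0.1),
                     momentum=momentum_at(0, 0.5, 0.9, 10),
                     l2=0.0)
```

Calling `momentum_at` and `learning_rate_at` once per epoch and passing the results into `momentum_step` reproduces the general shape of a momentum learning schedule.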

3-1. Experiments

The experiments use the Omniglot dataset, which contains 1623 characters from 50 different alphabets, each drawn by hand by 20 different people.

Accuracy on Omniglot Verification Task

Hierarchical Bayesian Program Learning (HBPL) requires stroke-order information. Unlike HBPL, the convolutional Siamese net does not need any domain knowledge.

Comparing best one-shot accuracy from each type of network against baselines

The authors also tested generalization to the MNIST dataset using a model trained only on Omniglot.

Results from MNIST 10-versus-1 one-shot classification task
