[2025-1] 유경석 - Road Extraction by Deep Residual U-Net

Abstract

Road extraction is an active research topic in remote sensing image analysis.
The paper performs road extraction with an architecture that combines residual learning and U-Net.
1) Residual units make deep networks easier to train.
2) Skip connections aid information propagation, giving better performance with fewer parameters.
In experiments on a public road dataset, ResUNet outperformed the other networks it was compared against.

1. Introduction

Road extraction

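Road extraction is a core task in remote sensing, applicable to automated navigation, unmanned vehicles, urban planning, and geographic information updating. Extracting roads from high-resolution remote sensing images remains difficult, however. The main open challenges are noise, occlusion (objects hidden behind other objects), and complex backgrounds.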

Within this task, road area extraction detects the region of an image occupied by roads and requires pixel-level labeling. It can therefore be approached as a segmentation, or pixel-level classification, problem. Several traditional methods have been devised for segmentation (SVM, hierarchical graph-based image segmentation, etc.), but deep learning approaches have shown higher performance and more potential.

Mnih and Hinton : applied restricted Boltzmann machines (RBMs), with a pre-processing step (dimensionality reduction) and a post-processing step (cleaning up incomplete regions)
Saito et al. : applied convolutional neural networks (CNNs) and achieved high performance.

Deep learning relies on deep architectures with many layers to reach high performance, but depth brings problems such as vanishing gradients. Two representative architectures that ease these problems are deep residual learning and U-Net.

Deep residual learning framework (He et al.) : implements skip connections using identity mapping
U-Net (Ronneberger et al.) : combines feature maps from different levels to extract information that joins low-level detail with high-level semantics; strong results in biomedical image segmentation

Deep residual U-Net (ResUNet) : an architecture inspired by deep residual learning and U-Net that takes the advantages of both.

Unlike U-Net, it 1) uses residual units as the basic block, and 2) needs no cropping operation.

2. Methodology

A. Deep ResUNet

1) U-Net : low-level detail + high-level semantic information

Data augmentation enlarges the limited amount of training data
Information propagation : low-level features are passed to the corresponding high levels → this eases backward propagation during training and adds low-level detail to the high-level semantic features

2) Residual unit : makes training effective and addresses the degradation problem

Overcomes the problems caused by stacking many layers in a neural network
Components of a residual unit : $\mathcal{F}$ (residual function), $f$ (activation function), $h$ (identity mapping function)

Built from a combination of batch normalization, ReLU activation, and convolutional layers
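
The three components above fit together in the usual residual formulation (He et al.): $y_l = h(x_l) + \mathcal{F}(x_l, W_l)$ and $x_{l+1} = f(y_l)$. When $h$ and $f$ are both identity, this reduces to $x_{l+1} = x_l + \mathcal{F}(x_l, W_l)$. The toy below is only a sketch of that idea on a 1-D, single-channel signal in plain Python. It assumes a pre-activation ordering (BN → ReLU → conv, applied twice) and a batch normalization without learned scale or shift. All function names are made up for illustration.

```python
import math

def batch_norm(x, eps=1e-5):
    """Normalize a 1-D signal to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]

def conv1d_same(x, kernel):
    """Size-3 convolution with zero padding, so output length equals input length."""
    padded = [0.0] + list(x) + [0.0]
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(x))]

def residual_unit(x, kernel_a, kernel_b):
    """Compute x + F(x), where F is (BN -> ReLU -> conv) applied twice."""
    out = conv1d_same(relu(batch_norm(x)), kernel_a)
    out = conv1d_same(relu(batch_norm(out)), kernel_b)
    # Identity shortcut: the input is added back unchanged.
    return [xi + fi for xi, fi in zip(x, out)]
```

If both kernels are all zeros, the residual branch contributes nothing and the unit returns its input unchanged. This is the property that lets very deep stacks start from an identity-like mapping.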

3) Deep ResUNet : combines U-Net with a residual neural network → training becomes easier, and information propagation gives better performance with fewer parameters.

7-level architecture in total, divided into 3 parts : Encoding, Bridge, Decoding
- Encoding : converts the input image into a compact representation
- Decoding : recovers a pixel-wise categorization (= semantic segmentation) from that representation
- Bridge : connects encoding and decoding

Built from residual units : 3x3 convolution blocks, identity mapping
- Encoding : 3 residual units; downsampling uses stride-2 convolution instead of pooling
- Decoding : 3 residual units; before each unit, the feature map is upsampled and concatenated with the encoding feature map
- At the last level, a 1x1 convolution and a sigmoid activation layer project the multi-channel feature map to the segmentation output
Uses 15 convolutional layers, fewer than U-Net's 23; no cropping needed
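
The layout above can be checked with a little bookkeeping. The sketch below walks through the seven levels and tracks spatial size, channel width, and the number of convolutions. It assumes two 3x3 convolutions per residual unit and channel widths of 64/128/256/512. Those widths are recalled from the paper's architecture table rather than stated in this summary, so treat them as an assumption. `resunet_layout` is a hypothetical helper name, and shortcut projections are not counted.

```python
def resunet_layout(input_size=224, widths=(64, 128, 256, 512)):
    """Return ([(name, spatial_size, channels), ...], number_of_convolutions)."""
    levels, convs = [], 0
    size = input_size
    # Encoding path plus bridge: every unit after the first halves the
    # resolution with a stride-2 convolution instead of pooling.
    for i, ch in enumerate(widths):
        if i > 0:
            size //= 2
        name = "bridge" if i == len(widths) - 1 else f"encoding {i + 1}"
        levels.append((name, size, ch))
        convs += 2  # assumed: two 3x3 convolutions per residual unit
    # Decoding path: upsample by 2, concatenate the matching encoder
    # feature map, then apply a residual unit.
    for j, ch in enumerate(reversed(widths[:-1])):
        size *= 2
        levels.append((f"decoding {j + 1}", size, ch))
        convs += 2
    # Final 1x1 convolution followed by a sigmoid.
    levels.append(("output", size, 1))
    convs += 1
    return levels, convs

if __name__ == "__main__":
    layout, n_convs = resunet_layout()
    for name, size, ch in layout:
        print(f"{name:12s} {size}x{size}x{ch}")
    print("convolutions:", n_convs)
```

Under these assumptions the count comes to 7 units x 2 convolutions + 1 output convolution = 15, which matches the figure quoted above. The output also returns to the input resolution, so no cropping is involved.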

B. Loss function

Estimate the parameters $W$ by minimizing the difference between $Net(I_i;W)$ and the ground truth $s_i$ → uses MSE, optimized with SGD

Any other differentiable loss function can also be used (U-Net uses pixel-wise cross entropy)
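
One way to read the MSE objective is $L(W) = \frac{1}{N}\sum_{i=1}^{N} \lVert Net(I_i;W) - s_i \rVert^2$, where $N$ is the number of training samples. The sketch below computes that quantity for predictions and ground-truth maps that have already been flattened into lists. The name `mse_loss` and the flattened-list representation are assumptions for illustration, not the authors' code.

```python
def mse_loss(predictions, targets):
    """Mean over samples of the squared L2 distance between prediction and ground truth.

    predictions, targets: lists of equal length, each item a flat list of pixel values.
    """
    assert len(predictions) == len(targets), "one target per prediction"
    total = 0.0
    for pred, target in zip(predictions, targets):
        # Squared L2 distance for one sample.
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
    return total / len(predictions)
```

For example, with one perfect prediction and one prediction of `[0.5, 0.5]` against a target of `[1.0, 0.0]`, the loss is (0 + 0.5) / 2 = 0.25.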

C. Result refinement

Input and output are both 224x224. An overlap strategy is used (improves accuracy near the boundaries)
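
A minimal sketch of one possible overlap strategy: cut the large image into 224x224 tiles that share a margin with their neighbors, predict each tile, and average the predictions wherever tiles overlap. The averaging rule, the default overlap of 14 pixels, and the helper names are assumptions not stated in this summary. `predict_tile` stands in for the trained network and must return a map of the same size as its input.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles covering `length`, with neighbors sharing at least `overlap` pixels."""
    if length <= tile:
        return [0]
    stride = tile - overlap  # requires overlap < tile
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the border
    return starts

def predict_with_overlap(image, predict_tile, tile=224, overlap=14):
    """Average per-tile predictions over a 2-D image given as a list of rows."""
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for top in tile_starts(h, tile, overlap):
        for left in tile_starts(w, tile, overlap):
            patch = [row[left:left + tile] for row in image[top:top + tile]]
            pred = predict_tile(patch)
            # Accumulate the prediction and remember how many tiles hit each pixel.
            for r, pred_row in enumerate(pred):
                for c, value in enumerate(pred_row):
                    total[top + r][left + c] += value
                    count[top + r][left + c] += 1
    return [[total[r][c] / count[r][c] for c in range(w)] for r in range(h)]
```

Each pixel in an overlap region is covered by more than one tile, so a prediction that is weak at one tile's edge gets averaged with another tile's prediction of the same pixel.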

3. Experiment

Uses the Massachusetts roads dataset. Compared against Mnih's method (RBMs), Saito's method (CNN), and U-Net
Comparison is based on relaxed precision, relaxed recall, and the break-even point
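
As a rough illustration of the relaxed metrics: relaxed precision is the fraction of predicted road pixels that lie within a small slack distance of some true road pixel, and relaxed recall is the fraction of true road pixels that lie within that distance of some predicted road pixel. The break-even point is where the two are equal on the precision-recall curve. The sketch below uses brute force over pixel sets. The default slack of 3 pixels and the Chebyshev (square-window) distance are assumptions, and the function names are hypothetical.

```python
def _near(pixel, others, slack):
    """True if any pixel in `others` is within `slack` (Chebyshev distance) of `pixel`."""
    r, c = pixel
    return any(abs(r - r2) <= slack and abs(c - c2) <= slack for r2, c2 in others)

def relaxed_precision_recall(predicted, truth, slack=3):
    """Relaxed precision and recall for sets of (row, col) road pixels."""
    if not predicted or not truth:
        return 0.0, 0.0
    # Predicted pixels that have a true road pixel nearby.
    hits_p = sum(_near(p, truth, slack) for p in predicted)
    # True road pixels that have a predicted pixel nearby.
    hits_r = sum(_near(t, predicted, slack) for t in truth)
    return hits_p / len(predicted), hits_r / len(truth)
```

The slack tolerates small misalignments between the prediction and the labels, which matters when the ground-truth road masks are themselves only approximately positioned.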

Deep ResUNet reached high relaxed precision and recall, the best performance among the compared methods.
It has only about 1/4 of U-Net's parameters yet performs better.

Figure: qualitative comparison. (a) Input image; (b) ground truth; (c) CNN (Saito et al.); (d) U-Net; (e) ResUNet

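ResUNet produced the best segmentation : less noise and clean handling of road intersections
It takes context information into account : distinguishes roads from similar-looking objects, handles occlusion, and does not label lanes inside parking lots as roads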

4. Conclusion

Applying ResUNet to high-resolution remote sensing images combines the advantages of residual learning and U-Net
The skip connections within residual units and U-Net's information propagation both ease training and yield a simple yet powerful neural network
Despite having fewer parameters, it outperforms the other models.
