[PyTorch] 2-1. Object Recognition with ResNet


euphoria0-0 2020. 7. 11. 18:20

This article is based on the book "Deep Learning with PyTorch"
https://pytorch.org/deep-learning-with-pytorch

2. Pretrained Networks

Models that label an image according to its content: AlexNet, ResNet
Models that generate new images from real images
Models that describe the content of an image in English

2.1. Object-Recognition

ImageNet dataset
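- The example models were trained on this dataset
- Tasks: image classification (assigning an image to a category), object localization (finding where an object is), object detection (identifying and labeling the objects in an image), scene classification (classifying the situation shown), scene parsing (segmenting an image into regions associated with semantic categories)
- Process: the input image is resized, center-cropped, and normalized, then passed through the pretrained model (its weights); the label is the class with the maximum score.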
TorchVision projects
- Includes models such as AlexNet, ResNet, and Inception v3, and datasets such as ImageNet

from torchvision import models

AlexNet
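- Won the 2012 ILSVRC with a top-5 error rate of 15.4%
- 1) Each block consists of several multiplications, additions, filters, and similar operations
- 2) The final layer produces one output probability per class, and the image is assigned to the class with the highest probability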

alexnet = models.AlexNet()  # architecture only: weights are randomly initialized, not pretrained

ResNet
- Won the 2015 ImageNet challenge (ILSVRC)

Figure: shortcut connections, the key idea of ResNet (source: paper)
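
To make the shortcut idea concrete, here is a minimal sketch of a residual block. It assumes a simplified block with two 3x3 convolutions; the class name `TinyResidualBlock` and the channel count are illustrative and are not taken from torchvision's actual ResNet code. The block computes F(x) and adds the input x back before the final activation.

```python
import torch
from torch import nn

class TinyResidualBlock(nn.Module):
    """Hypothetical, simplified residual block: out = relu(F(x) + x)."""

    def __init__(self, channels: int):
        super().__init__()
        # padding=1 keeps the spatial size, so F(x) and x can be added
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.conv2(self.relu(self.conv1(x)))  # F(x)
        return self.relu(residual + x)                   # shortcut adds x back

block = TinyResidualBlock(channels=8)
x = torch.randn(1, 8, 16, 16)   # random stand-in input (N, C, H, W)
y = block(x)
print(y.shape)  # same shape as the input
```

Because the input is added back unchanged, the block only has to learn the difference from the identity, which is what makes very deep stacks such as resnet101 trainable.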

resnet = models.resnet101(pretrained=True)

(1) Settings for Run

resnet

layer: operations, the building blocks of a neural network
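Bottleneck: the repeated module in the printout; together they form a sequential cascade of filters and nonlinear functions, followed by a final scoring layer
transforms: preprocessing that brings the input image to the right size and puts its values in the same numeric range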

from torchvision import transforms
preprocess = transforms.Compose([
    transforms.Resize(256),      # resize so the shorter side is 256 pixels
    transforms.CenterCrop(224),  # crop the central 224x224 region
    transforms.ToTensor(),       # convert to a tensor with values in [0, 1]
    transforms.Normalize(        # normalize each RGB channel
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )])
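
The arithmetic behind these transforms can be sketched in plain Python. This is only an approximation for illustration: the helper names are hypothetical, the 1280x720 input size is made up, and the library's own rounding may differ by a pixel.

```python
def resize_shorter_side(width, height, size):
    """Scale so the shorter side equals `size`, keeping the aspect ratio."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size

def center_crop_box(width, height, crop):
    """Return (left, top, right, bottom) of a centered crop x crop box."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop

def normalize(value, mean, std):
    """Per-channel normalization applied after values are scaled to [0, 1]."""
    return (value - mean) / std

new_w, new_h = resize_shorter_side(1280, 720, 256)
box = center_crop_box(new_w, new_h, 224)
print(new_w, new_h)  # the shorter side becomes 256
print(box)           # a 224x224 window around the image center
print(normalize(0.485, mean=0.485, std=0.229))  # the red-channel mean maps to 0.0
```

Note that a non-square image is not squashed to 256x256: only the shorter side is set to 256, and the center crop then produces the square 224x224 input.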

(2) Testing with a new image

1) View the image with the Pillow module

from PIL import Image
img = Image.open("../data/p1ch2/bobby.jpg")
img.show()

2) Preprocess the image

img_t = preprocess(img)

3) unsqueeze: add a batch dimension, since the network expects a batch of images

import torch
batch_t = torch.unsqueeze(img_t, 0)
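
A quick shape check shows what `unsqueeze` does here. The zero tensor below is only a stand-in for a preprocessed image, so the snippet runs without any image file.

```python
import torch

img_like = torch.zeros(3, 224, 224)    # stand-in for a preprocessed image (C, H, W)
batch = torch.unsqueeze(img_like, 0)   # insert a batch dimension at position 0
print(img_like.shape)
print(batch.shape)                     # now (N, C, H, W) with N = 1
```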

(3) Run!

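Run the new image through the trained model (inference)
1) Call eval: without it, layers such as batch normalization and dropout in the pretrained network stay in training mode and do not give meaningful results.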

resnet.eval()
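
The effect of `eval()` is easy to see on a single dropout layer. This is a small sketch on a standalone `nn.Dropout`, not on the ResNet itself (the post does not say which of these layers a given pretrained model contains).

```python
import torch
from torch import nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()        # training mode: elements are randomly zeroed
y_train = drop(x)   # surviving elements are scaled by 1 / (1 - p) = 2

drop.eval()         # evaluation mode: dropout passes the input through
y_eval = drop(x)

print(y_train)
print(y_eval)
```

In training mode the same input gives a different output on every call, which is why inference without `eval()` is unreliable.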

2) Inference
- The output holds one score per class; the predicted label is the one at the index with the highest score

out = resnet(batch_t)
out
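
As a side note that is not part of the original walkthrough: inference does not need gradients, so the forward pass is often wrapped in `torch.no_grad()`. The sketch below uses a tiny `nn.Linear` as a hypothetical stand-in for the pretrained network so that it runs without downloading weights.

```python
import torch
from torch import nn

model = nn.Linear(4, 3)    # hypothetical stand-in for the pretrained network
model.eval()
batch = torch.randn(1, 4)  # random stand-in for a preprocessed batch

with torch.no_grad():      # do not record operations for autograd
    scores = model(batch)

print(scores.shape)
print(scores.requires_grad)
```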

(4) Assigning a class

Load the 1,000 labels of the ImageNet dataset

with open('../data/p1ch2/imagenet_classes.txt') as f:
	labels = [line.strip() for line in f.readlines()]

Get the index of the highest score

_, index = torch.max(out, 1)
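
With a dimension argument, `torch.max` returns both the maximum values and their indices, which is why the first result is discarded above. A small sketch with made-up scores for one image and four classes:

```python
import torch

scores = torch.tensor([[0.1, 2.5, -1.0, 0.7]])  # made-up scores: 1 image, 4 classes
value, index = torch.max(scores, 1)             # maximum over the class dimension
print(value)   # highest score for each image in the batch
print(index)   # class index of that score for each image
```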

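Look up the class from the index above
Softmax normalizes the outputs to the range [0, 1] so that they sum to 1, which gives a rough measure of the model's confidence.
According to the result below, the model is about 96% confident that the photo shows a golden retriever.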

percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
labels[index[0]], percentage[index[0]].item()

('golden retriever', 96.29334259033203)
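
What softmax computes can be sketched in plain Python. The three scores are made up, and the helper name `softmax_percent` is hypothetical.

```python
import math

def softmax_percent(scores):
    """Convert raw scores to percentages that sum to 100."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [100.0 * e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # made-up raw scores for three classes
percent = softmax_percent(scores)
best = max(range(len(scores)), key=lambda i: percent[i])
print(percent)
print(best)  # index of the most likely class
```

Softmax keeps the ordering of the scores, so the index of the largest percentage is the same as the index of the largest raw score.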

We can also see which classes the model considers next most likely after golden retriever, and with what probability.

_, indices = torch.sort(out, descending=True)
[(labels[idx], percentage[idx].item()) for idx in indices[0][:5]]

[('golden retriever', 96.29334259033203),
('Labrador retriever', 2.80812406539917),
('cocker spaniel, English cocker spaniel, cocker', 0.28267428278923035),
('redbone', 0.2086310237646103),
('tennis ball', 0.11621569097042084)]

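The results also include ('tennis ball', about 0.1%). This is probably because the training data contains many pictures of tennis balls with dogs nearby, so the model associates the two. In other words, objects that commonly appear together in images can lead to wrong predictions, so some care is needed.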
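
The top-five listing could also be produced with `torch.topk`, which returns only the k largest entries instead of sorting all 1,000 scores. A minimal sketch with made-up percentages and a hypothetical label list:

```python
import torch

percent = torch.tensor([1.5, 90.0, 0.5, 8.0])  # made-up percentages for 4 classes
names = ["cat", "dog", "ball", "wolf"]         # hypothetical label list
values, indices = torch.topk(percent, k=2)     # two largest entries, largest first
top2 = [(names[i], v) for i, v in zip(indices.tolist(), values.tolist())]
print(top2)
```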
