Survey on Deep Learning with Imbalanced Data Sets


Bibliographic Details
Main Authors: Tsai, Cheng-Hsiao, 蔡承孝
Other Authors: 蔡炎龍
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/b8r339
Description
Summary: Master's === National Chengchi University === Department of Applied Mathematics === 108 === This paper is a survey of deep learning with imbalanced data sets and anomaly detection. We create two imbalanced data sets from MNIST: one for a multi-class classification task with minority classes 0, 1, 4, 6, and 7, and one for a binary classification task with minority class 0. Our data sets are highly imbalanced, with imbalance ratio ρ = 2500, and we use a convolutional neural network (CNN) for training. For anomaly detection, we use the pretrained CNN handwriting classifier to decide whether 18 cat and dog pictures are handwriting pictures or not. Because the data sets are imbalanced, the baseline model performs poorly on the minority classes. Hence, we use 6 and 7 different methods to adjust our model. We find that the focal loss function and random over-sampling (ROS) perform best on the multi-class classification task and the binary classification task on our imbalanced data sets, whereas the cost-sensitive learning method is not suitable for them. Using confidence estimation, our classifier successfully judges that none of the cat and dog pictures are handwriting pictures.
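The two best-performing methods named in the summary, focal loss and random over-sampling (ROS), can be illustrated with a minimal sketch. This is not the thesis code: the function names, the default gamma = 2.0, and the definition of the imbalance ratio as the largest class size divided by the smallest are assumptions made for illustration, using only the Python standard library.

```python
import math
import random
from collections import Counter


def focal_loss(probs, label, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    probs: predicted class probabilities; label: index of the true class.
    gamma=2.0 is an assumed default, not a value taken from the thesis.
    With gamma=0 this reduces to ordinary cross-entropy.
    """
    p_t = min(max(probs[label], 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def imbalance_ratio(labels):
    """Assumed definition: size of the largest class over the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_over_sample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_x.extend(extra)
        out_y.extend([cls] * len(extra))
    return out_x, out_y


if __name__ == "__main__":
    # Toy data: five samples of class 0, one sample of class 1.
    xs = ["a", "b", "c", "d", "e", "f"]
    ys = [0, 0, 0, 0, 0, 1]
    print("ratio before:", imbalance_ratio(ys))
    bx, by = random_over_sample(xs, ys)
    print("ratio after:", imbalance_ratio(by))
    # A confident correct prediction is down-weighted relative to a poor one.
    print(focal_loss([0.9, 0.1], 0), focal_loss([0.1, 0.9], 0))
```

The factor (1 - p_t)^gamma shrinks the loss of well-classified samples, so the abundant, easy majority examples contribute less to training, while ROS instead rebalances the data itself before training.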