Survey on Deep Learning with Imbalanced Data Sets

Master's === National Chengchi University === Department of Applied Mathematics === 108 === This paper surveys deep learning with imbalanced data sets and anomaly detection. We create two imbalanced data sets from MNIST: one for a multi-class classification task with minority classes 0, 1, 4, 6, and 7, and one for a binary classification task with minority class 0. Our data set...

Bibliographic Details
Main Authors:	Tsai, Cheng-Hsiao, 蔡承孝
Other Authors:	蔡炎龍
Format:	Others
Language:	en_US
Published:	2019
Online Access:	http://ndltd.ncl.edu.tw/handle/b8r339

id	ndltd-TW-108NCCU5507001
record_format	oai_dc
spelling	ndltd-TW-108NCCU5507001 2019-10-12T03:34:53Z http://ndltd.ncl.edu.tw/handle/b8r339 Survey on Deep Learning with Imbalanced Data Sets 深度學習在不平衡數據集之研究 Tsai, Cheng-Hsiao 蔡承孝 Master's, National Chengchi University, Department of Applied Mathematics, 108 蔡炎龍 2019 thesis 168 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Chengchi University === Department of Applied Mathematics === 108 === This paper surveys deep learning with imbalanced data sets and anomaly detection. We create two imbalanced data sets from MNIST: one for a multi-class classification task with minority classes 0, 1, 4, 6, and 7, and one for a binary classification task with minority class 0. Both data sets are highly imbalanced, with imbalance ratio ρ = 2500, and we train a convolutional neural network (CNN) on them. For anomaly detection, we use the pretrained CNN handwriting classifier to decide whether the 18 cat and dog pictures are handwriting pictures. Because the data sets are imbalanced, the baseline model performs poorly on the minority classes. Hence, we use 6 and 7 different methods to adjust our model. We find that the focal loss function and random over-sampling (ROS) perform best on the multi-class and binary classification tasks on our imbalanced data sets, whereas the cost-sensitive learning method is not suitable for them. Using confidence estimation, our classifier correctly judges that none of the cat and dog pictures are handwriting pictures.
author2	蔡炎龍
author_facet	蔡炎龍 Tsai, Cheng-Hsiao 蔡承孝
author	Tsai, Cheng-Hsiao 蔡承孝
author_sort	Tsai, Cheng-Hsiao
title	Survey on Deep Learning with Imbalanced Data Sets
publishDate	2019
url	http://ndltd.ncl.edu.tw/handle/b8r339

Similar Items