BALANCED VS IMBALANCED TRAINING DATA: CLASSIFYING RAPIDEYE DATA WITH SUPPORT VECTOR MACHINES

The accuracy of supervised image classification depends strongly on several factors, including the design of the training set (sample selection, composition, purity and size), the resolution of the input imagery and landscape heterogeneity. Training set design remains a challenging issue since the sen...


Bibliographic Details
Main Authors: M. Ustuner, F. B. Sanli, S. Abdikan
Format: Article
Language: English
Published: Copernicus Publications 2016-06-01
Series: The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLI-B7/379/2016/isprs-archives-XLI-B7-379-2016.pdf
id doaj-c76e9f94904f45a4a8dac1f859c06b2b
record_format Article
spelling doaj-c76e9f94904f45a4a8dac1f859c06b2b
2020-11-24T21:52:00Z
eng
Copernicus Publications
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
1682-1750
2194-9034
2016-06-01
XLI-B7
379
384
10.5194/isprs-archives-XLI-B7-379-2016
BALANCED VS IMBALANCED TRAINING DATA: CLASSIFYING RAPIDEYE DATA WITH SUPPORT VECTOR MACHINES
M. Ustuner 0
F. B. Sanli 1
S. Abdikan 2
Department of Geomatics Engineering, Yildiz Technical University, Istanbul, Turkey
Department of Geomatics Engineering, Yildiz Technical University, Istanbul, Turkey
Department of Geomatics Engineering, Bulent Ecevit University, Zonguldak, Turkey
The accuracy of supervised image classification depends strongly on several factors, including the design of the training set (sample selection, composition, purity and size), the resolution of the input imagery and landscape heterogeneity. Training set design remains challenging because classifier algorithms differ in their sensitivity to the same dataset at the learning stage. This paper addresses the classification of RapidEye imagery with balanced and imbalanced training data for mapping crop types. Classification with imbalanced training data may result in low accuracy in some scenarios. Support Vector Machines (SVM), Maximum Likelihood (ML) and Artificial Neural Network (ANN) classifiers were used to classify the data. To evaluate the influence of balanced and imbalanced training data on these algorithms, three training datasets were created: two balanced datasets with 70 and 100 pixels per class of interest, and one imbalanced dataset in which each class has a different number of pixels. The results show that imbalanced training data reduce the accuracy of ML and ANN classifications (from 90.94% to 85.94% for ML and from 91.56% to 88.44% for ANN), whereas SVM is not significantly affected and even improves slightly (from 94.38% to 94.69%). These results indicate that SVM is a robust, consistent and effective classifier that performs well with both balanced and imbalanced training data. Furthermore, the training stage should be designed carefully to suit the needs of the adopted classifier.
https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLI-B7/379/2016/isprs-archives-XLI-B7-379-2016.pdf
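The accuracy figures quoted in the abstract can be restated as changes in percentage points. The snippet below is only an arithmetic sketch: it assumes the first figure of each pair is the balanced case and the second the imbalanced case, and the names `reported` and `accuracy_change` are illustrative, not taken from the paper. The abstract does not say which balanced dataset (70 or 100 pixels per class) the first figure refers to.

```python
# Overall accuracies (%) quoted in the abstract, as (first, second) pairs.
# Assumption: first = balanced training data, second = imbalanced training data.
reported = {
    "ML": (90.94, 85.94),
    "ANN": (91.56, 88.44),
    "SVM": (94.38, 94.69),
}


def accuracy_change(before, after):
    """Return the change in percentage points, rounded to two decimals."""
    return round(after - before, 2)


changes = {name: accuracy_change(*pair) for name, pair in reported.items()}

for name, delta in changes.items():
    print(f"{name}: {delta:+.2f} percentage points")
```

Under that reading, ML loses 5.00 points, ANN loses 3.12 points, and SVM gains 0.31 points.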
collection DOAJ
language English
format Article
sources DOAJ
author M. Ustuner
F. B. Sanli
S. Abdikan
title BALANCED VS IMBALANCED TRAINING DATA: CLASSIFYING RAPIDEYE DATA WITH SUPPORT VECTOR MACHINES
publisher Copernicus Publications
series The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
issn 1682-1750
2194-9034
publishDate 2016-06-01
description The accuracy of supervised image classification depends strongly on several factors, including the design of the training set (sample selection, composition, purity and size), the resolution of the input imagery and landscape heterogeneity. Training set design remains challenging because classifier algorithms differ in their sensitivity to the same dataset at the learning stage. This paper addresses the classification of RapidEye imagery with balanced and imbalanced training data for mapping crop types. Classification with imbalanced training data may result in low accuracy in some scenarios. Support Vector Machines (SVM), Maximum Likelihood (ML) and Artificial Neural Network (ANN) classifiers were used to classify the data. To evaluate the influence of balanced and imbalanced training data on these algorithms, three training datasets were created: two balanced datasets with 70 and 100 pixels per class of interest, and one imbalanced dataset in which each class has a different number of pixels. The results show that imbalanced training data reduce the accuracy of ML and ANN classifications (from 90.94% to 85.94% for ML and from 91.56% to 88.44% for ANN), whereas SVM is not significantly affected and even improves slightly (from 94.38% to 94.69%). These results indicate that SVM is a robust, consistent and effective classifier that performs well with both balanced and imbalanced training data. Furthermore, the training stage should be designed carefully to suit the needs of the adopted classifier.
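The training-set design described above (two balanced sets with 70 and 100 pixels per class, plus one imbalanced set) can be sketched in plain Python. This is a minimal illustration, not the authors' procedure: the crop class names, the pool sizes, the imbalanced counts and the random five-value pixel vectors are all hypothetical, and `draw_training_set` is a helper invented for this sketch.

```python
import random
from collections import Counter, defaultdict


def group_by_class(samples):
    """Group (feature_vector, label) pairs into a dict of label -> vectors."""
    groups = defaultdict(list)
    for features, label in samples:
        groups[label].append(features)
    return groups


def draw_training_set(samples, per_class, seed=0):
    """Randomly draw a training set with a chosen pixel count per class.

    per_class is an int (balanced: same count for every class) or a
    dict mapping label -> count (imbalanced).
    """
    rng = random.Random(seed)
    groups = group_by_class(samples)
    training = []
    for label in sorted(groups):
        pixels = groups[label]
        n = per_class if isinstance(per_class, int) else per_class[label]
        if n > len(pixels):
            raise ValueError(f"class {label!r} has only {len(pixels)} pixels")
        training.extend((features, label) for features in rng.sample(pixels, n))
    return training


# Hypothetical labelled pool: three made-up crop classes, each pixel a
# vector of five random band values (placeholder data, not real imagery).
pool_rng = random.Random(42)
pool = [
    ([pool_rng.random() for _ in range(5)], label)
    for label, size in {"corn": 300, "wheat": 200, "cotton": 150}.items()
    for _ in range(size)
]

balanced_70 = draw_training_set(pool, 70)
balanced_100 = draw_training_set(pool, 100)
# Hypothetical imbalanced design: a different pixel count for each class.
imbalanced = draw_training_set(pool, {"corn": 120, "wheat": 60, "cotton": 30})

for name, subset in [
    ("balanced_70", balanced_70),
    ("balanced_100", balanced_100),
    ("imbalanced", imbalanced),
]:
    print(name, dict(Counter(label for _, label in subset)))
```

Each resulting list of `(features, label)` pairs could then be passed to whichever classifier is being compared, so that only the class composition of the training data changes between runs.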
url https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLI-B7/379/2016/isprs-archives-XLI-B7-379-2016.pdf