Optimized Sample Selection in SVM Classification by Combining with DMSP-OLS, Landsat NDVI and GlobeLand30 Products for Extracting Urban Built-Up Areas

Bibliographic Details
Main Authors: Xiaolong Ma, Xiaohua Tong, Sicong Liu, Xin Luo, Huan Xie, Chengming Li
Format: Article
Language: English
Published: MDPI AG 2017-03-01
Series: Remote Sensing
Subjects:
Online Access: http://www.mdpi.com/2072-4292/9/3/236
Description
Summary: The accuracy of the training samples used by classification methods such as support vector machines (SVMs) has a considerable impact on the results of urban area extraction. To improve the accuracy of urban built-up area extraction, this paper presents a sample-optimized approach for classifying urban areas using a combination of Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) nighttime light data, Landsat images, and GlobeLand30, a 30-m global land cover data product. The proposed approach consists of three main components: (1) initial sample generation and classification into built-up and non-built-up areas based on the maximum and minimum intervals of digital numbers from the DMSP-OLS data, respectively; (2) refined sample selection and optimization by a per-pixel probability threshold based on vegetation cover, using the Landsat-derived normalized difference vegetation index (NDVI) and the artificial surfaces extracted from the GlobeLand30 product as constraints; (3) iterative classification and urban built-up area extraction using the relationship between these three data sources together with the training sets. Experiments were conducted for several cities in western China. The built-up areas extracted with the proposed approach, which were classified using urban construction statistical yearbooks and Landsat images, were compared with results from traditional methods, such as the threshold dichotomy method and the improved neighborhood focal statistics method.
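The first two components described above (seeding labels from nighttime-light digital numbers, then filtering them with NDVI and GlobeLand30) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the threshold values, the function names, and the use of a hard NDVI cutoff in place of the paper's per-pixel probability threshold are all assumptions made for illustration. The only fixed fact used is that DMSP-OLS digital numbers range from 0 to 63.

```python
# Hypothetical thresholds: none of these values come from the paper.
DN_HIGH = 50     # DMSP-OLS digital numbers at or above this seed "built-up"
DN_LOW = 5       # digital numbers at or below this seed "non-built-up"
NDVI_MAX = 0.3   # built-up seeds with more vegetation than this are dropped

BUILT_UP, NON_BUILT_UP, UNLABELED = 1, 0, -1


def initial_label(dn):
    """Step 1: seed a label from the nighttime-light digital number (0-63)."""
    if dn >= DN_HIGH:
        return BUILT_UP
    if dn <= DN_LOW:
        return NON_BUILT_UP
    return UNLABELED


def refine_label(label, ndvi, is_artificial):
    """Step 2: keep a seed only if NDVI and GlobeLand30 agree with it.

    A hard NDVI cutoff stands in for the probability threshold
    mentioned in the summary (an assumption of this sketch).
    """
    if label == BUILT_UP and (ndvi > NDVI_MAX or not is_artificial):
        return UNLABELED
    if label == NON_BUILT_UP and is_artificial:
        return UNLABELED
    return label


def select_samples(dn, ndvi, artificial):
    """Return one refined label per pixel for three aligned pixel lists."""
    return [
        refine_label(initial_label(d), n, a)
        for d, n, a in zip(dn, ndvi, artificial)
    ]


# Five made-up pixels: bright and bare, bright but vegetated, dark and natural,
# mid-brightness (never seeded), dark but mapped as artificial surface.
labels = select_samples(
    dn=[63, 60, 2, 30, 1],
    ndvi=[0.10, 0.55, 0.70, 0.20, 0.15],
    artificial=[True, True, False, True, True],
)
print(labels)  # [1, -1, 0, -1, -1]
```

Pixels that end up labeled 1 or 0 would form the training set for the SVM in the third component; pixels left at -1 would be classified by the trained model rather than used to train it.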
An analysis of the empirical results indicated that (1) the sample training process was improved using the proposed method, and the overall accuracy (OA) increased from 89% to 96% for both the optimized and non-optimized sample selection; (2) the proposed method had a relative error of less than 10%, as calculated by an accuracy assessment; (3) the overall and individual class accuracies were higher for artificial surfaces in GlobeLand30; and (4) the average OA improved markedly, and the Kappa coefficient for Chengdu increased from 0.54 to 0.80. The experimental results therefore demonstrate that the proposed approach is a reliable solution for extracting urban built-up areas with a high degree of accuracy.
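For readers unfamiliar with the two accuracy measures quoted above, the following Python sketch shows how overall accuracy and Cohen's Kappa coefficient, kappa = (p_o - p_e) / (1 - p_e), are conventionally computed from a confusion matrix. The matrix values are invented for illustration and are not results from the paper; the function names are likewise hypothetical.

```python
def overall_accuracy(matrix):
    """Share of pixels on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's Kappa: agreement beyond what chance alone would give."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_o = overall_accuracy(matrix)  # observed agreement
    # Expected chance agreement from the row and column marginals.
    p_e = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / (total * total)
    return (p_o - p_e) / (1 - p_e)


# Invented 2x2 example: rows are reference classes, columns are predictions,
# in the order (built-up, non-built-up).
confusion = [
    [40, 10],
    [5, 45],
]

oa = overall_accuracy(confusion)
k = kappa(confusion)
print(f"OA = {oa:.2f}, Kappa = {k:.2f}")
```

Kappa is lower than OA for the same matrix because it discounts the agreement expected by chance, which is why a study may report both.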
ISSN: 2072-4292