Data classification algorithm for data-intensive computing environments

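Abstract: Data-intensive computing has received substantial attention since the arrival of the big data era. Research on data mining in data-intensive computing environments is still in the initial stage. In this paper, a decision tree classification algorithm called MR-DIDC is proposed that is based on the programming framework of MapReduce and the SPRINT algorithm. MR-DIDC inherits the advantages of MapReduce, which make the algorithm more suitable for data-intensive computing applications. The performance of the algorithm is evaluated based on an example. The results of experiments showed that MR-DIDC can shorten the operation time and improve the accuracy in a big data environment.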

Bibliographic Details
Main Authors: Tiedong Chen, Shifeng Liu, Daqing Gong, Honghu Gao
Format: Article
Language: English
Published: SpringerOpen 2017-12-01
Series: EURASIP Journal on Wireless Communications and Networking
Subjects: Data-intensive, Data mining, MR-DIDC, MapReduce
Online Access: http://link.springer.com/article/10.1186/s13638-017-1002-4
id doaj-e267b665d33240df91b3751e163755a5
record_format Article
spelling doaj-e267b665d33240df91b3751e163755a5
2020-11-25T02:01:38Z
eng
SpringerOpen
EURASIP Journal on Wireless Communications and Networking
1687-1499
2017-12-01
20171110
10.1186/s13638-017-1002-4
Data classification algorithm for data-intensive computing environments
Tiedong Chen (School of Economics and Management, Beijing Jiaotong University)
Shifeng Liu (School of Economics and Management, Beijing Jiaotong University)
Daqing Gong (School of Economics and Management, Beijing Jiaotong University)
Honghu Gao (School of Economics and Management, Beijing Jiaotong University)
Abstract: Data-intensive computing has received substantial attention since the arrival of the big data era. Research on data mining in data-intensive computing environments is still in the initial stage. In this paper, a decision tree classification algorithm called MR-DIDC is proposed that is based on the programming framework of MapReduce and the SPRINT algorithm. MR-DIDC inherits the advantages of MapReduce, which make the algorithm more suitable for data-intensive computing applications. The performance of the algorithm is evaluated based on an example. The results of experiments showed that MR-DIDC can shorten the operation time and improve the accuracy in a big data environment.
http://link.springer.com/article/10.1186/s13638-017-1002-4
Data-intensive
Data mining
MR-DIDC
MapReduce
collection DOAJ
language English
format Article
sources DOAJ
author Tiedong Chen
Shifeng Liu
Daqing Gong
Honghu Gao
spellingShingle Tiedong Chen
Shifeng Liu
Daqing Gong
Honghu Gao
Data classification algorithm for data-intensive computing environments
EURASIP Journal on Wireless Communications and Networking
Data-intensive
Data mining
MR-DIDC
MapReduce
author_facet Tiedong Chen
Shifeng Liu
Daqing Gong
Honghu Gao
author_sort Tiedong Chen
title Data classification algorithm for data-intensive computing environments
title_short Data classification algorithm for data-intensive computing environments
title_full Data classification algorithm for data-intensive computing environments
title_fullStr Data classification algorithm for data-intensive computing environments
title_full_unstemmed Data classification algorithm for data-intensive computing environments
title_sort data classification algorithm for data-intensive computing environments
publisher SpringerOpen
series EURASIP Journal on Wireless Communications and Networking
issn 1687-1499
publishDate 2017-12-01
description Abstract Data-intensive computing has received substantial attention since the arrival of the big data era. Research on data mining in data-intensive computing environments is still in the initial stage. In this paper, a decision tree classification algorithm called MR-DIDC is proposed that is based on the programming framework of MapReduce and the SPRINT algorithm. MR-DIDC inherits the advantages of MapReduce, which make the algorithm more suitable for data-intensive computing applications. The performance of the algorithm is evaluated based on an example. The results of experiments showed that MR-DIDC can shorten the operation time and improve the accuracy in a big data environment.
topic Data-intensive
Data mining
MR-DIDC
MapReduce
url http://link.springer.com/article/10.1186/s13638-017-1002-4
work_keys_str_mv AT tiedongchen dataclassificationalgorithmfordataintensivecomputingenvironments
AT shifengliu dataclassificationalgorithmfordataintensivecomputingenvironments
AT daqinggong dataclassificationalgorithmfordataintensivecomputingenvironments
AT honghugao dataclassificationalgorithmfordataintensivecomputingenvironments
_version_ 1724956581327536128