Data classification algorithm for data-intensive computing environments
Abstract Data-intensive computing has received substantial attention since the arrival of the big data era. Research on data mining in data-intensive computing environments is still in the initial stage. In this paper, a decision tree classification algorithm called MR-DIDC is proposed that is based...
Main Authors: | Tiedong Chen, Shifeng Liu, Daqing Gong, Honghu Gao |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen 2017-12-01 |
Series: | EURASIP Journal on Wireless Communications and Networking |
Subjects: | Data-intensive, Data mining, MR-DIDC, MapReduce |
Online Access: | http://link.springer.com/article/10.1186/s13638-017-1002-4 |
id |
doaj-e267b665d33240df91b3751e163755a5 |
---|---|
record_format |
Article |
spelling |
doaj-e267b665d33240df91b3751e163755a5; 2020-11-25T02:01:38Z; eng; SpringerOpen; EURASIP Journal on Wireless Communications and Networking; 1687-1499; 2017-12-01; 20171110; 10.1186/s13638-017-1002-4; Data classification algorithm for data-intensive computing environments; Tiedong Chen [0]; Shifeng Liu [1]; Daqing Gong [2]; Honghu Gao [3]; [0] School of Economics and Management, Beijing Jiaotong University; [1] School of Economics and Management, Beijing Jiaotong University; [2] School of Economics and Management, Beijing Jiaotong University; [3] School of Economics and Management, Beijing Jiaotong University; Abstract Data-intensive computing has received substantial attention since the arrival of the big data era. Research on data mining in data-intensive computing environments is still in the initial stage. In this paper, a decision tree classification algorithm called MR-DIDC is proposed that is based on the programming framework of MapReduce and the SPRINT algorithm. MR-DIDC inherits the advantages of MapReduce, which make the algorithm more suitable for data-intensive computing applications. The performance of the algorithm is evaluated based on an example. The results of experiments showed that MR-DIDC can shorten the operation time and improve the accuracy in a big data environment.; http://link.springer.com/article/10.1186/s13638-017-1002-4; Data-intensive; Data mining; MR-DIDC; MapReduce |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Tiedong Chen Shifeng Liu Daqing Gong Honghu Gao |
title |
Data classification algorithm for data-intensive computing environments |
publisher |
SpringerOpen |
series |
EURASIP Journal on Wireless Communications and Networking |
issn |
1687-1499 |
publishDate |
2017-12-01 |
description |
Abstract Data-intensive computing has received substantial attention since the arrival of the big data era. Research on data mining in data-intensive computing environments is still in the initial stage. In this paper, a decision tree classification algorithm called MR-DIDC is proposed that is based on the programming framework of MapReduce and the SPRINT algorithm. MR-DIDC inherits the advantages of MapReduce, which make the algorithm more suitable for data-intensive computing applications. The performance of the algorithm is evaluated based on an example. The results of experiments showed that MR-DIDC can shorten the operation time and improve the accuracy in a big data environment. |
topic |
Data-intensive Data mining MR-DIDC MapReduce |
url |
http://link.springer.com/article/10.1186/s13638-017-1002-4 |