Performance Evaluation of Simple K-Mean and Parallel K-Mean Clustering Algorithms: Big Data Business Process Management Concept

Data is the most valuable asset in any firm. As time passes, the data expands at a breakneck speed. A major research issue is the extraction of meaningful information from a complex and huge data source. Clustering is one of the data extraction methods. The basic K-Mean and Parallel K-Mean partition...

Bibliographic Details
Main Authors: Ali, S. (Author), Elmannai, H. (Author), Hadjouni, M. (Author), Jameel, A. (Author), Khan, I. (Author), Serat, A.M. (Author), Zada, I. (Author), Zeeshan, M. (Author)
Format: Article
Language: English
Published: Hindawi Limited 2022
Subjects:
Online Access: View Fulltext in Publisher
LEADER 03149nam a2200409Ia 4500
001 10.1155-2022-1277765
008 220718s2022 CNT 000 0 und d
020 |a 1574017X (ISSN) 
245 1 0 |a Performance Evaluation of Simple K-Mean and Parallel K-Mean Clustering Algorithms: Big Data Business Process Management Concept 
260 0 |b Hindawi Limited  |c 2022 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1155/2022/1277765 
520 3 |a Data is the most valuable asset in any firm. As time passes, the data expands at a breakneck speed. A major research issue is the extraction of meaningful information from a complex and huge data source. Clustering is one of the data extraction methods. The basic K-Mean and Parallel K-Mean partition clustering algorithms work by picking random starting centroids. The basic and K-Mean parallel clustering methods are investigated in this work using two different datasets with sizes of 10,000 and 5,000, respectively. The findings of the Simple K-Mean clustering algorithms alter throughout numerous runs or iterations, according to the study, and so iterations differ for each run or execution. In some circumstances, the clustering algorithms' outcomes are always different, and the algorithms separate and identify unique properties of the K-Mean Simple clustering algorithm from the K-Mean Parallel clustering algorithm. Differentiating these features will improve cluster quality, elapsed time, and iterations. Experiments are designed to show that parallel algorithms considerably improve the Simple K-Mean techniques. The findings of the parallel techniques are also consistent; however, the Simple K-Mean algorithm's results vary from run to run. Both the 10,000 and 5,000 data item datasets are divided into ten subdatasets for ten different client systems. Clusters are generated in two iterations, i.e., the time it takes for all client systems to complete one iteration (mentioned in chapter number 4). In the first execution, Client No. 5 has the longest elapsed time (8 ms), whereas the longest elapsed time in the following iterations is 6 ms, for a total elapsed time of 12 ms for the K-Mean clustering technique. In addition, the Parallel algorithms reduce the number of executions and the time it takes to complete a task. © 2022 Islam Zada et al. 
650 0 4 |a Big data 
650 0 4 |a Business Process 
650 0 4 |a Data business 
650 0 4 |a Data mining 
650 0 4 |a Data-source 
650 0 4 |a Extraction 
650 0 4 |a Iterative methods 
650 0 4 |a K-means 
650 0 4 |a K-means clustering 
650 0 4 |a K-means clustering algorithms 
650 0 4 |a Parallel algorithms 
650 0 4 |a Parallel clustering 
650 0 4 |a Performances evaluation 
650 0 4 |a Process management 
650 0 4 |a Research issues 
650 0 4 |a Simple++ 
700 1 |a Ali, S.  |e author 
700 1 |a Elmannai, H.  |e author 
700 1 |a Hadjouni, M.  |e author 
700 1 |a Jameel, A.  |e author 
700 1 |a Khan, I.  |e author 
700 1 |a Serat, A.M.  |e author 
700 1 |a Zada, I.  |e author 
700 1 |a Zeeshan, M.  |e author 
773 |t Mobile Information Systems  |x 1574017X (ISSN)  |g 2022