Studying the Reduction Techniques for Mining Engineering Datasets
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Published: | University of Sindh, 2018-04-01 |
Series: | University of Sindh Journal of Information and Communication Technology |
Subjects: | |
Online Access: | http://sujo.usindh.edu.pk/index.php/USJICT/article/view/4164/2837 |
Summary: | Companies around the world often hold huge datasets in data warehouses. The sheer size of these collections can make the data difficult to analyze, mainly because of its complexity in terms of the number of attributes and the number of cases. One way to overcome this problem is to reduce the dataset to a sufficient number of attributes and cases before mining it. In the data mining field, several methods can be used to reduce the number of attributes and remove similar cases. This paper presents a study that tests three reduction methods in the engineering domain using five datasets. The three methods are Genetic Algorithm (GA), Principal Component Analysis (PCA), and the Johnson technique. The five datasets were obtained from the UCI machine learning archive. The study examines which reduction method is suitable for datasets in the engineering field by ranking the three reduction methods on percentage accuracy and number of selected attributes. |
---|---|
ISSN: | 2521-5582 2523-1235 |
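
The Summary says the three reduction methods are ranked on percentage accuracy and number of selected attributes. A minimal sketch of such a ranking is given here, assuming that higher accuracy ranks first and that ties are broken by fewer selected attributes; the article's actual ranking rule is not stated in this record. The names `ReductionResult` and `rank_methods` are hypothetical, and the numbers are placeholders, not results from the article.

```python
from typing import NamedTuple


class ReductionResult(NamedTuple):
    """One reduction method's outcome on a dataset (hypothetical structure)."""
    method: str
    accuracy_pct: float  # classification accuracy after reduction, in percent
    n_attributes: int    # number of attributes the method kept


def rank_methods(results):
    """Order results by higher accuracy first, then by fewer attributes.

    The tie-breaking rule is an assumption made for this sketch.
    """
    return sorted(results, key=lambda r: (-r.accuracy_pct, r.n_attributes))


# Placeholder values for illustration only; NOT figures from the article.
example = [
    ReductionResult("GA", 90.0, 6),
    ReductionResult("PCA", 85.0, 4),
    ReductionResult("Johnson", 90.0, 5),
]

ranking = [r.method for r in rank_methods(example)]
print(ranking)
```

With these placeholder values, the two methods tied on accuracy are separated by how many attributes each one kept.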