Studying the Reduction Techniques for Mining Engineering Datasets

Bibliographic Details
Main Author: Mustafa Ali Abuzaraida
Format: Article
Language: English
Published: University of Sindh 2018-04-01
Series: University of Sindh Journal of Information and Communication Technology
Online Access: http://sujo.usindh.edu.pk/index.php/USJICT/article/view/4164/2837
Description
Summary: Companies around the world often hold huge datasets in data warehouses. Their enormous size can make the data difficult to analyze, mainly because of its complexity in terms of the number of attributes and the number of cases. This problem can be overcome by reducing the dataset to a sufficient number of attributes and cases before mining it. In the data mining field, several methods can be used to reduce the number of attributes and the number of similar cases. This paper presents a study that tests three reduction methods in the engineering domain using five datasets. The three methods are Genetic Algorithm (GA), Principal Component Analysis (PCA), and the Johnson technique. The five datasets were obtained from the UCI machine learning archive. The study examines which reduction method is suitable for datasets in the engineering field by ranking the three methods on percentage accuracy and number of selected attributes.
ISSN: 2521-5582, 2523-1235
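
The summary names the Johnson technique without describing it. Assuming it refers to Johnson's greedy heuristic for rough-set attribute reduction (an assumption, since the record does not say), the idea can be sketched as follows: for every pair of cases with different decisions, record the attributes on which they differ, then repeatedly select the attribute that covers the most remaining pairs. The function names and the toy decision table below are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def discernibility_clauses(rows, decisions):
    """For each pair of cases with different decisions, collect the set of
    attribute indices on which the two cases differ."""
    clauses = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(k for k, (x, y) in enumerate(zip(a, b)) if x != y)
            if diff:  # identical cases with different decisions cannot be told apart
                clauses.append(diff)
    return clauses


def johnson_reduct(rows, decisions):
    """Greedy approximation: repeatedly pick the attribute that appears in the
    most remaining clauses, then drop every clause that attribute covers."""
    clauses = discernibility_clauses(rows, decisions)
    selected = []
    while clauses:
        counts = {}
        for clause in clauses:
            for attr in clause:
                counts[attr] = counts.get(attr, 0) + 1
        # Ties are broken in favor of the lowest attribute index.
        best = max(sorted(counts), key=lambda attr: counts[attr])
        selected.append(best)
        clauses = [c for c in clauses if best not in c]
    return sorted(selected)


# Toy decision table (hypothetical values, not from the paper):
# three attributes per case, one decision label per case.
rows = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 0),
]
decisions = ["a", "a", "b", "b"]

reduct = johnson_reduct(rows, decisions)
print(reduct)  # prints [0]
```

In this toy table attribute 0 alone separates the two decision classes, so the greedy pass stops after one selection. The heuristic returns a small attribute subset but does not guarantee the smallest one.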