Summary: | Data management systems are essential to any organization that deals with large volumes of data. As data volume and complexity grow, it becomes harder for a workload management system to maintain its performance. There is therefore a need for a system that can handle such complexity autonomically, with little or no human involvement. The performance of these systems can be improved by making them aware of the workload entering the system. The workload of a typical database or data warehouse system can be characterized into three types: Online Transaction Processing (OLTP), Decision Support Systems (DSS), and Mixed. The literature currently reports autonomic characterization of workload into two classes, OLTP and DSS; characterizing the workload into three types is a multi-class classification problem and a more challenging task. In this study, we propose a novel optimized Case-based Reasoning (CBR) approach based on clustering that autonomically characterizes the workload into multi-class types before it enters the system. We implement the four phases of CBR along with case-base generation and map them to the elements of the autonomic MAPE-K model. In the Retrieve phase, k-means clustering is used to enhance retrieval efficiency, and workload type predictions are made in the Reuse phase. A Genetic Algorithm is used in the Revise and Adapt phase of CBR. A few autonomic self-* characteristics are incorporated to make the approach autonomic. We performed various experiments, and the results show that the proposed model outperforms existing approaches in prediction. We validated the results against other machine learning classifiers using the Friedman test followed by a post-hoc test, which show that the proposed model stands out as the best classifier.
|
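The Retrieve and Reuse phases described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the two workload features (fraction of write statements, normalized rows scanned per query), the toy case base, and the function names `kmeans`, `retrieve`, and `reuse` are all assumptions made for illustration, and the Genetic Algorithm used in the Revise and Adapt phase is omitted. The idea shown is only that clustering the case base lets retrieval search a single cluster instead of every stored case.

```python
from collections import Counter
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; centroids start at the first k points (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[idx].append(p)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                # Mean of each coordinate over the cluster members.
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def nearest_centroid(features, centroids):
    """Index of the centroid closest to the given feature vector."""
    return min(range(len(centroids)), key=lambda i: dist(features, centroids[i]))


def retrieve(case_base, centroids, query, n_neighbors=3):
    """Retrieve: search only the cluster whose centroid is nearest to the query."""
    target = nearest_centroid(query, centroids)
    candidates = [(f, label) for f, label in case_base
                  if nearest_centroid(f, centroids) == target]
    candidates.sort(key=lambda case: dist(case[0], query))
    return candidates[:n_neighbors]


def reuse(retrieved):
    """Reuse: predict the majority workload type among the retrieved cases."""
    return Counter(label for _, label in retrieved).most_common(1)[0][0]


# Hypothetical case base: (write_fraction, normalized_rows_scanned) -> workload type.
case_base = [
    ((0.90, 0.10), "OLTP"), ((0.50, 0.50), "Mixed"), ((0.10, 0.90), "DSS"),
    ((0.85, 0.15), "OLTP"), ((0.80, 0.05), "OLTP"),
    ((0.55, 0.45), "Mixed"), ((0.45, 0.55), "Mixed"),
    ((0.15, 0.85), "DSS"), ((0.05, 0.95), "DSS"),
]
centroids = kmeans([features for features, _ in case_base], k=3)

for query in [(0.82, 0.12), (0.48, 0.52), (0.08, 0.90)]:
    print(query, reuse(retrieve(case_base, centroids, query)))
```

In this sketch a write-heavy query with few rows scanned falls into the cluster of OLTP cases, so only those cases are compared against it. A fuller version would revise the retrieved solution and retain confirmed cases, which are the remaining CBR phases the summary mentions.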