An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach

Cardiovascular disease is defined as a set of conditions related to the disorder of the heart and blood vessels. Predicting and diagnosing cardiovascular disease is significant to ensure the appropriate treatment of this disease. Machine learning approaches are generally utilized to automatically de...

Bibliographic Details
Main Authors: Mohammed Nasir Uddin, Rajib Kumar Halder
Format: Article
Language: English
Published: Elsevier 2021-01-01
Series: Informatics in Medicine Unlocked
Subjects: Machine learning; Cardiovascular disease; Feature selection; Ensemble model; Classification
Online Access: http://www.sciencedirect.com/science/article/pii/S2352914821000745
id doaj-5751f17bcb204b5cb0d24fa23be0bd95
record_format Article
spelling doaj-5751f17bcb204b5cb0d24fa23be0bd95
2021-06-19T04:55:05Z
eng
Elsevier
Informatics in Medicine Unlocked
2352-9148
2021-01-01
24
100584
An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach
Mohammed Nasir Uddin, Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh
Rajib Kumar Halder (corresponding author), Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh
http://www.sciencedirect.com/science/article/pii/S2352914821000745
Machine learning
Cardiovascular disease
Feature selection
Ensemble model
Classification
collection DOAJ
language English
format Article
sources DOAJ
author Mohammed Nasir Uddin
Rajib Kumar Halder
spellingShingle Mohammed Nasir Uddin
Rajib Kumar Halder
An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach
Informatics in Medicine Unlocked
Machine learning
Cardiovascular disease
Feature selection
Ensemble model
Classification
author_facet Mohammed Nasir Uddin
Rajib Kumar Halder
author_sort Mohammed Nasir Uddin
title An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach
title_short An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach
title_full An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach
title_fullStr An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach
title_full_unstemmed An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach
title_sort ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach
publisher Elsevier
series Informatics in Medicine Unlocked
issn 2352-9148
publishDate 2021-01-01
description Cardiovascular disease is a set of conditions involving disorders of the heart and blood vessels. Predicting and diagnosing cardiovascular disease is essential to ensuring appropriate treatment. Machine learning approaches are widely used to detect hidden patterns in vast amounts of data automatically, without human intervention. In the early stage of cardiovascular disease, a machine learning model can help physicians make the right decision about medication. This research aims to develop an intelligent agent that predicts cardiovascular disease so that preventive steps can be identified before any untoward incident occurs. This paper proposes an ensemble method-based multilayer dynamic system (MLDS) that can improve its current knowledge in every layer. The proposed model applies Correlation Attribute Evaluator (CAE), Gain Ratio Attribute Evaluator (GRAE), Information Gain Attribute Evaluator (IGAE), Lasso, and the Extra Trees classifier (ETC) for feature selection. Random Forest (RF), Naïve Bayes (NB), and Gradient Boosting (GB) classifiers are then combined to form the ensemble used for classification. The K Nearest Neighbor (KNN) algorithm is applied to find the neighboring data points of a test instance when the base classifiers fail to classify it correctly in any layer. To test the proposed model's efficiency, we used a realistic dataset (70,000 instances) collected from Kaggle. The proposed model achieved 88.84%, 89.44%, 91.56%, 92.72%, and 94.16% accuracy across train:test splitting ratios of 50:50, 60:40, 70:30, 80:20, and 87.5:12.5. At the 87.5:12.5 splitting ratio, the model achieved an AUC of 0.94, meaning that it ranks a randomly chosen positive case above a randomly chosen negative case with 94% probability. The Cleveland, Hungarian, and Cleveland-Hungary-Switzerland-Long Beach datasets were also used to train the model, which achieved 98.88%, 99.53%, 99.98%, 98.36%, 96.66%, 97.77%, 99.56%, and 94.37% accuracy depending on the splitting ratios of these datasets. The proposed model was compared with five other models, and the comparison indicates that it can effectively predict cardiovascular disease.
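The classification stage and the AUC figure in the description can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain majority vote over three base-classifier outputs (the description does not say how RF, NB, and GB are combined), and it computes AUC as the share of positive/negative pairs that are ranked correctly. The function names `majority_vote` and `pairwise_auc`, and all labels and scores, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def majority_vote(predictions):
    """Return the most common label among base-classifier predictions.

    Assumption: a plain majority vote; the source does not specify
    how the RF, NB, and GB outputs are combined.
    """
    return Counter(predictions).most_common(1)[0][0]


def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical outputs of three base classifiers for one patient record.
vote = majority_vote([1, 0, 1])

# Made-up labels and predicted probabilities, for illustration only.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
auc = pairwise_auc(labels, scores)

print(vote, round(auc, 4))
```

In this toy data, five of the six positive/negative pairs are ordered correctly, so the AUC is 5/6. An AUC of 0.94 on real data would be read the same way: about 94% of such pairs are ranked with the positive case above the negative one.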
topic Machine learning
Cardiovascular disease
Feature selection
Ensemble model
Classification
url http://www.sciencedirect.com/science/article/pii/S2352914821000745
work_keys_str_mv AT mohammednasiruddin anensemblemethodbasedmultilayerdynamicsystemtopredictcardiovasculardiseaseusingmachinelearningapproach
AT rajibkumarhalder anensemblemethodbasedmultilayerdynamicsystemtopredictcardiovasculardiseaseusingmachinelearningapproach
AT mohammednasiruddin ensemblemethodbasedmultilayerdynamicsystemtopredictcardiovasculardiseaseusingmachinelearningapproach
AT rajibkumarhalder ensemblemethodbasedmultilayerdynamicsystemtopredictcardiovasculardiseaseusingmachinelearningapproach
_version_ 1721371741072130048