Machine learning methodologies versus cardiovascular risk scores, in predicting disease risk
Abstract Background The use of Cardiovascular Disease (CVD) risk estimation scores in primary prevention has long been established. However, their performance still remains a matter of concern. The aim of this study was to explore the potential of using ML methodologies on CVD prediction, especially...
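Main Authors: | Alexandros C. Dimopoulos, Mara Nikolaidou, Francisco Félix Caballero, Worrawat Engchuan, Albert Sanchez-Niubo, Holger Arndt, José Luis Ayuso-Mateos, Josep Maria Haro, Somnath Chatterji, Ekavi N. Georgousopoulou, Christos Pitsavos, Demosthenes B. Panagiotakos |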
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2018-12-01 |
Series: | BMC Medical Research Methodology |
Subjects: | Cardiovascular disease, Risk prediction, Machine learning, Model performance |
Online Access: | http://link.springer.com/article/10.1186/s12874-018-0644-1 |
id |
doaj-7768fde1fc96457ca27bc9eb6cecd5de |
---|---|
record_format |
Article |
spelling |
doaj-7768fde1fc96457ca27bc9eb6cecd5de
2020-11-25T00:36:38Z
eng
BMC
BMC Medical Research Methodology
1471-2288
2018-12-01
18 1 1 11
10.1186/s12874-018-0644-1
Machine learning methodologies versus cardiovascular risk scores, in predicting disease risk
Alexandros C. Dimopoulos [0]; Mara Nikolaidou [1]; Francisco Félix Caballero [2]; Worrawat Engchuan [3]; Albert Sanchez-Niubo [4]; Holger Arndt [5]; José Luis Ayuso-Mateos [6]; Josep Maria Haro [7]; Somnath Chatterji [8]; Ekavi N. Georgousopoulou [9]; Christos Pitsavos [10]; Demosthenes B. Panagiotakos [11]
[0] Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University
[1] Department of Informatics & Telematics, School of Digital Technology, Harokopio University
[2] Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid
[3] The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children
[4] Parc Sanitari Sant Joan de Déu
[5] SPRING TECHNO GMBH & Co. KG
[6] Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid
[7] CIBER of Epidemiology and Public Health
[8] Health Metrics and Measurement, World Health Organization
[9] Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University
[10] School of Medicine, University of Athens
[11] Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University
Abstract. Background: The use of cardiovascular disease (CVD) risk estimation scores in primary prevention has long been established. However, their performance remains a matter of concern. The aim of this study was to explore the potential of machine learning (ML) methodologies for CVD prediction, especially compared to an established risk tool, the HellenicSCORE. Methods: Data from the ATTICA prospective study (n = 2020 adults), enrolled during 2001–02 and followed up in 2011–12, were used. Three different machine-learning classifiers (k-NN, random forest, and decision tree) were trained and evaluated against 10-year CVD incidence, in comparison with the HellenicSCORE tool (a calibration of the ESC SCORE). Training datasets, ranging from 16 variables down to only 5 variables, were chosen, with or without bootstrapping, in an attempt to achieve the best overall performance for the machine learning classifiers. Results: Depending on the classifier and the training dataset, the outcome varied in efficiency but was comparable between the two methodological approaches. In particular, the HellenicSCORE showed accuracy 85%, specificity 20%, sensitivity 97%, positive predictive value 87%, and negative predictive value 58%, whereas for the machine learning methodologies, accuracy ranged from 65 to 84%, specificity from 46 to 56%, sensitivity from 67 to 89%, positive predictive value from 89 to 91%, and negative predictive value from 24 to 45%; random forest gave the best results, while k-NN gave the poorest results. Conclusions: The alternative approach of machine learning classification produced results comparable to those of risk prediction scores and, thus, can be used as a method of CVD prediction, taking into consideration the advantages that machine learning methodologies may offer.
http://link.springer.com/article/10.1186/s12874-018-0644-1
Cardiovascular disease; Risk prediction; Machine learning; Model performance |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Alexandros C. Dimopoulos Mara Nikolaidou Francisco Félix Caballero Worrawat Engchuan Albert Sanchez-Niubo Holger Arndt José Luis Ayuso-Mateos Josep Maria Haro Somnath Chatterji Ekavi N. Georgousopoulou Christos Pitsavos Demosthenes B. Panagiotakos |
title |
Machine learning methodologies versus cardiovascular risk scores, in predicting disease risk |
publisher |
BMC |
series |
BMC Medical Research Methodology |
issn |
1471-2288 |
publishDate |
2018-12-01 |
description |
Abstract. Background: The use of cardiovascular disease (CVD) risk estimation scores in primary prevention has long been established. However, their performance remains a matter of concern. The aim of this study was to explore the potential of machine learning (ML) methodologies for CVD prediction, especially compared to an established risk tool, the HellenicSCORE. Methods: Data from the ATTICA prospective study (n = 2020 adults), enrolled during 2001–02 and followed up in 2011–12, were used. Three different machine-learning classifiers (k-NN, random forest, and decision tree) were trained and evaluated against 10-year CVD incidence, in comparison with the HellenicSCORE tool (a calibration of the ESC SCORE). Training datasets, ranging from 16 variables down to only 5 variables, were chosen, with or without bootstrapping, in an attempt to achieve the best overall performance for the machine learning classifiers. Results: Depending on the classifier and the training dataset, the outcome varied in efficiency but was comparable between the two methodological approaches. In particular, the HellenicSCORE showed accuracy 85%, specificity 20%, sensitivity 97%, positive predictive value 87%, and negative predictive value 58%, whereas for the machine learning methodologies, accuracy ranged from 65 to 84%, specificity from 46 to 56%, sensitivity from 67 to 89%, positive predictive value from 89 to 91%, and negative predictive value from 24 to 45%; random forest gave the best results, while k-NN gave the poorest results. Conclusions: The alternative approach of machine learning classification produced results comparable to those of risk prediction scores and, thus, can be used as a method of CVD prediction, taking into consideration the advantages that machine learning methodologies may offer. |
topic |
Cardiovascular disease Risk prediction Machine learning Model performance |
url |
http://link.springer.com/article/10.1186/s12874-018-0644-1 |
_version_ |
1725304339603390464 |
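
The abstract in this record compares the HellenicSCORE and the machine learning classifiers on five measures: accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. As a reading aid, the minimal sketch below shows how these measures are conventionally derived from the four cells of a binary confusion matrix. It is not the authors' code: the function name `classification_metrics` and the counts in the usage example are hypothetical and are not data from the ATTICA study, and the record does not state which outcome class the study treated as positive.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute five standard binary-classification measures.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives. The function name and interface
    are illustrative assumptions, not part of the catalogued study.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN when a measure is undefined (empty denominator).
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, total),       # correct predictions / all cases
        "sensitivity": ratio(tp, tp + fn),       # true positive rate
        "specificity": ratio(tn, tn + fp),       # true negative rate
        "ppv": ratio(tp, tp + fp),               # positive predictive value
        "npv": ratio(tn, tn + fn),               # negative predictive value
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the function.
    example = classification_metrics(tp=90, fp=60, tn=40, fn=10)
    for name, value in example.items():
        print(f"{name}: {value:.2f}")
```

Because sensitivity and specificity are each computed within one true class, while the predictive values depend on how common each class is, a tool can combine very high sensitivity with low specificity, which is the pattern the abstract reports for the HellenicSCORE.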