Building predictive models for MERS-CoV infections using data mining techniques
Summary: Background: Recently, the outbreak of MERS-CoV infections drew worldwide attention to Saudi Arabia. The novel virus belongs to the coronavirus family, whose members are responsible for mild to moderate colds. The control and command center of the Saudi Ministry of Health issues a daily repo...
Main Authors: | Isra Al-Turaiki, Mona Alshahrani, Tahani Almutairi |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2016-11-01 |
Series: | Journal of Infection and Public Health |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1876034116301460 |
id |
doaj-73dc97606b564a43a5d42d67714de5c3 |
---|---|
record_format |
Article |
spelling |
doaj-73dc97606b564a43a5d42d67714de5c3; 2020-11-25T01:48:36Z; eng; Elsevier; Journal of Infection and Public Health; ISSN 1876-0341; 2016-11-01; vol. 9, no. 6, pp. 744-748; Building predictive models for MERS-CoV infections using data mining techniques; Isra Al-Turaiki, Mona Alshahrani (corresponding author), Tahani Almutairi; all authors: Information Technology Department, College of Computer and Information Sciences, King Saud University, Saudi Arabia; Keywords: MERS-CoV, Data mining, Decision tree, J48, Naive Bayes, Classification; http://www.sciencedirect.com/science/article/pii/S1876034116301460 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Isra Al-Turaiki Mona Alshahrani Tahani Almutairi |
title |
Building predictive models for MERS-CoV infections using data mining techniques |
publisher |
Elsevier |
series |
Journal of Infection and Public Health |
issn |
1876-0341 |
publishDate |
2016-11-01 |
description |
Summary: Background: Recently, the outbreak of MERS-CoV infections drew worldwide attention to Saudi Arabia. The novel virus belongs to the coronavirus family, whose members are responsible for mild to moderate colds. The control and command center of the Saudi Ministry of Health issues a daily report on MERS-CoV infection cases. Infection with MERS-CoV can lead to fatal complications; however, little is known about this novel virus. In this paper, we apply two data mining techniques to better understand the stability of, and the possibility of recovery from, MERS-CoV infections. Method: The Naive Bayes classifier and the J48 decision tree algorithm were used to build our models. The dataset consists of 1082 records of cases reported between 2013 and 2015. To build our prediction models, we split the dataset into two groups. The first group combined recovery and death records; a new attribute was created to indicate the record type, so that the dataset can be used to predict recovery from MERS-CoV. The second group contained the new case records, which are used to predict the stability of the infection based on the current status attribute. Results: The resulting recovery models indicate that healthcare workers are more likely to survive. This could be due to the vaccinations that healthcare workers are required to get on a regular basis. As for the stability models using J48, two attributes were found to be important for predicting stability: symptomatic and age. Older patients are at high risk of developing MERS-CoV complications. Finally, the performance of all the models was evaluated using three measures: accuracy, precision, and recall. In general, the accuracy of the models is between 53.6% and 71.58%. Conclusion: We believe that the performance of the prediction models can be enhanced with the use of more patient data. As future work, we plan to contact hospitals in Riyadh directly in order to collect more information related to patients with MERS-CoV infections. Keywords: MERS-CoV, Data mining, Decision tree, J48, Naive Bayes, Classification |
url |
http://www.sciencedirect.com/science/article/pii/S1876034116301460 |
_version_ |
1725011164732063744 |
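
The abstract names a Naive Bayes classifier and three evaluation measures (accuracy, precision, recall). The following is a minimal sketch of that kind of workflow, not the authors' implementation: the records in `TRAIN`, the attribute names, and all function names are hypothetical and invented for illustration, and the sketch covers only a categorical Naive Bayes with Laplace smoothing, not the J48 decision tree.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy records, NOT the study's data:
# (healthcare_worker, symptomatic, age_group) -> record type
TRAIN = [
    (("yes", "no", "young"), "recovery"),
    (("yes", "yes", "young"), "recovery"),
    (("yes", "no", "old"), "recovery"),
    (("no", "yes", "old"), "death"),
    (("no", "yes", "old"), "death"),
    (("no", "no", "young"), "recovery"),
    (("no", "yes", "young"), "death"),
    (("yes", "yes", "old"), "death"),
]


def train_naive_bayes(records):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(label for _, label in records)
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for features, label in records:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, features):
    """Return the label with the highest log-posterior under Naive Bayes."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(features):
            # Laplace (add-one) smoothing avoids zero probabilities.
            numerator = value_counts[(i, label)][value] + 1
            denominator = n + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


MODEL = train_naive_bayes(TRAIN)
```

Calling `predict(MODEL, ("no", "yes", "old"))` classifies one hypothetical record, and `evaluate` can then be applied to the true and predicted labels of a held-out set. On the real dataset, a held-out split or cross-validation would be needed before the three measures mean anything.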