Hepatitis C Virus (HCV) Prediction by Machine Learning Techniques

Hepatitis C is a prevalent disease worldwide, especially in countries like Egypt. An estimated 3-4 million new cases occur every year, making it a public health problem that should be addressed with identification and treatment policies. In the initial stage it is asymptomatic; however...

Bibliographic Details
Main Authors:	Satish CR Nandipati, Chew XinYing, Khaw Khai Wah
Format:	Article
Language:	English
Published:	ARQII PUBLICATION 2020-03-01
Series:	Applications of Modelling and Simulation
Subjects:	classification; feature selection; hepatitis c virus; machine learning; prediction multi and binary class labels; python and r tools
Online Access:	http://arqiipubl.com/ojs/index.php/AMS_Journal/article/view/122/82

id	doaj-351cecc5941146bf8007af6a4de403c2
record_format	Article
spelling	doaj-351cecc5941146bf8007af6a4de403c2 | 2020-11-25T02:31:43Z | eng | ARQII PUBLICATION | Applications of Modelling and Simulation | 2600-8084 | 2020-03-01 | 489100 | Hepatitis C Virus (HCV) Prediction by Machine Learning Techniques | Satish CR Nandipati 0 | Chew XinYing 1 | Khaw Khai Wah 2 | School of Computer Sciences, Universiti Sains Malaysia, Pulau Pinang, Malaysia | School of Computer Sciences, Universiti Sains Malaysia, Pulau Pinang, Malaysia | School of Management, Universiti Sains Malaysia, Pulau Pinang, Malaysia | Hepatitis C is a prevalent disease worldwide, especially in countries like Egypt. An estimated 3-4 million new cases occur every year, making it a public health problem that should be addressed with identification and treatment policies. In the initial stage it is asymptomatic; however, as the infection progresses it leads to chronic conditions such as liver cirrhosis and hepatocellular carcinoma. Various non-invasive serum biochemical markers are used to identify this disease. This study aims to compare performance between multiclass and binary class labels of the same dataset, in addition to a tool comparison, and to identify which selected features play a key role in the prediction of Hepatitis C Virus (HCV), using a dataset of Egyptian patients. The highest accuracy is achieved by KNN (51.06%, R) for the multiclass label and by random forest (54.56%, Python) for the binary class label. The overall comparison of evaluation metrics shows R to be the better tool for this case. The binary class label, in turn, yields better performance scores than the multiclass label. The multiple feature selection methods did not show any similar arrangement or topology in the ranking order of selected features. Finally, the 12 features selected by principal component analysis perform similarly to both the complete dataset and the 21 selected features, suggesting that these features may play a role in HCV prediction. | http://arqiipubl.com/ojs/index.php/AMS_Journal/article/view/122/82 | classification | feature selection | hepatitis c virus | machine learning | prediction multi and binary class labels | python and r tools
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Satish CR Nandipati, Chew XinYing, Khaw Khai Wah
author_facet	Satish CR Nandipati, Chew XinYing, Khaw Khai Wah
author_sort	Satish CR Nandipati
title	Hepatitis C Virus (HCV) Prediction by Machine Learning Techniques
title_sort	hepatitis c virus (hcv) prediction by machine learning techniques
publisher	ARQII PUBLICATION
series	Applications of Modelling and Simulation
issn	2600-8084
publishDate	2020-03-01
description	Hepatitis C is a prevalent disease worldwide, especially in countries like Egypt. An estimated 3-4 million new cases occur every year, making it a public health problem that should be addressed with identification and treatment policies. In the initial stage it is asymptomatic; however, as the infection progresses it leads to chronic conditions such as liver cirrhosis and hepatocellular carcinoma. Various non-invasive serum biochemical markers are used to identify this disease. This study aims to compare performance between multiclass and binary class labels of the same dataset, in addition to a tool comparison, and to identify which selected features play a key role in the prediction of Hepatitis C Virus (HCV), using a dataset of Egyptian patients. The highest accuracy is achieved by KNN (51.06%, R) for the multiclass label and by random forest (54.56%, Python) for the binary class label. The overall comparison of evaluation metrics shows R to be the better tool for this case. The binary class label, in turn, yields better performance scores than the multiclass label. The multiple feature selection methods did not show any similar arrangement or topology in the ranking order of selected features. Finally, the 12 features selected by principal component analysis perform similarly to both the complete dataset and the 21 selected features, suggesting that these features may play a role in HCV prediction.
topic	classification; feature selection; hepatitis c virus; machine learning; prediction multi and binary class labels; python and r tools
url	http://arqiipubl.com/ojs/index.php/AMS_Journal/article/view/122/82
work_keys_str_mv	AT satishcrnandipati hepatitiscvirushcvpredictionbymachinelearningtechniques AT chewxinying hepatitiscvirushcvpredictionbymachinelearningtechniques AT khawkhaiwah hepatitiscvirushcvpredictionbymachinelearningtechniques
_version_	1724822480048095232

Similar Items