Using Machine Learning Algorithms to Predict Hepatitis B Surface Antigen Seroclearance

Hepatitis B surface antigen (HBsAg) seroclearance during treatment is associated with a better prognosis among patients with chronic hepatitis B (CHB). Significant gaps remain in our understanding of how to predict HBsAg seroclearance accurately and efficiently based on obtainable clinical informati...

Bibliographic Details
Main Authors: Xiaolu Tian, Yutian Chong, Yutao Huang, Pi Guo, Mengjie Li, Wangjian Zhang, Zhicheng Du, Xiangyong Li, Yuantao Hao
Format: Article
Language: English
Published: Hindawi Limited 2019-01-01
Series: Computational and Mathematical Methods in Medicine
Online Access: http://dx.doi.org/10.1155/2019/6915850
id doaj-ffc1ac19ff7043cb911237e14026abd8
record_format Article
spelling doaj-ffc1ac19ff7043cb911237e14026abd8 2020-11-25T00:16:15Z eng
Hindawi Limited
Computational and Mathematical Methods in Medicine
1748-670X 1748-6718
2019-01-01 2019 10.1155/2019/6915850 6915850
Using Machine Learning Algorithms to Predict Hepatitis B Surface Antigen Seroclearance
Xiaolu Tian, Department of Medical Statistics and Epidemiology & Health Information Research Center & Guangdong Key Laboratory of Medicine, School of Public Health, Sun Yat-sen University, Guangzhou 510080, China
Yutian Chong, Department of Infectious Diseases, The Third Affiliated Hospital, Sun Yat-sen University, Guangzhou 510630, China
Yutao Huang, School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China
Pi Guo, Department of Public Health, Medical College of Shantou University, Shantou 515063, China
Mengjie Li, Department of Medical Statistics and Epidemiology & Health Information Research Center & Guangdong Key Laboratory of Medicine, School of Public Health, Sun Yat-sen University, Guangzhou 510080, China
Wangjian Zhang, Department of Environmental Health Sciences, School of Public Health, University at Albany, State University of New York, Rensselaer 12144, USA
Zhicheng Du, Department of Medical Statistics and Epidemiology & Health Information Research Center & Guangdong Key Laboratory of Medicine, School of Public Health, Sun Yat-sen University, Guangzhou 510080, China
Xiangyong Li, Department of Infectious Diseases, The Third Affiliated Hospital, Sun Yat-sen University, Guangzhou 510630, China
Yuantao Hao, Department of Medical Statistics and Epidemiology & Health Information Research Center & Guangdong Key Laboratory of Medicine, School of Public Health, Sun Yat-sen University, Guangzhou 510080, China
collection DOAJ
language English
format Article
sources DOAJ
author Xiaolu Tian
Yutian Chong
Yutao Huang
Pi Guo
Mengjie Li
Wangjian Zhang
Zhicheng Du
Xiangyong Li
Yuantao Hao
title Using Machine Learning Algorithms to Predict Hepatitis B Surface Antigen Seroclearance
publisher Hindawi Limited
series Computational and Mathematical Methods in Medicine
issn 1748-670X
1748-6718
publishDate 2019-01-01
description Hepatitis B surface antigen (HBsAg) seroclearance during treatment is associated with a better prognosis among patients with chronic hepatitis B (CHB). Significant gaps remain in our understanding of how to predict HBsAg seroclearance accurately and efficiently from obtainable clinical information. This study aimed to identify the optimal model for predicting HBsAg seroclearance. We obtained laboratory and demographic information for 2,235 patients with CHB from the South China Hepatitis Monitoring and Administration (SCHEMA) cohort. HBsAg seroclearance occurred in 106 of these patients. We developed models based on four algorithms: extreme gradient boosting (XGBoost), random forest (RF), decision tree (DCT), and logistic regression (LR). The optimal model was identified by the area under the receiver operating characteristic curve (AUC). The AUCs for the XGBoost, RF, DCT, and LR models were 0.891, 0.829, 0.619, and 0.680, respectively, with XGBoost showing the best predictive performance. The variable importance plot of the XGBoost model indicated that the level of HBsAg was the most important predictor, followed by age and the level of hepatitis B virus (HBV) DNA. Machine learning algorithms, especially XGBoost, performed well in predicting HBsAg seroclearance. The results show the potential of machine learning algorithms for predicting HBsAg seroclearance from obtainable clinical data.
url http://dx.doi.org/10.1155/2019/6915850
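
The abstract states that the optimal model was identified by AUC, with seroclearance observed in 106 of 2,235 patients (about 4.7%). The sketch below illustrates, under stated assumptions, how candidate models might be ranked by AUC. It is not the authors' code: the function names `auc` and `best_model`, the model names, and all labels and scores are hypothetical toy values, and the AUC is computed with the standard pairwise (Mann-Whitney) identity rather than any particular library.

```python
from typing import Dict, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    Equals the probability that a randomly chosen positive case is scored
    higher than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_model(
    labels: Sequence[int], scores_by_model: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return (name, AUC) of the candidate with the highest AUC."""
    aucs = {name: auc(labels, s) for name, s in scores_by_model.items()}
    name = max(aucs, key=aucs.get)
    return name, aucs[name]


# Toy data, NOT from the study: 1 = seroclearance, 0 = no seroclearance.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
toy_scores = {
    "model_a": [0.9, 0.2, 0.4, 0.7, 0.1, 0.3, 0.5, 0.8],
    "model_b": [0.6, 0.5, 0.4, 0.3, 0.2, 0.7, 0.1, 0.8],
}

if __name__ == "__main__":
    print(best_model(y_true, toy_scores))
```

Because AUC depends only on how cases are ranked, it remains informative when the outcome is rare, as it is here; in practice the scores would come from held-out predictions of each fitted model.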