Development and utility assessment of a machine learning bloodstream infection classifier in pediatric patients receiving cancer treatments

Abstract. Background: Objectives were to build a machine learning algorithm to identify bloodstream infection (BSI) among pediatric patients with cancer and hematopoietic stem cell transplantation (HSCT) recipients, and to compare this approach with presence of neutropenia to identify BSI. Methods: We...


Bibliographic Details
Main Authors: Lillian Sung, Conor Corbin, Ethan Steinberg, Emily Vettese, Aaron Campigotto, Loreto Lecce, George A. Tomlinson, Nigam Shah
Format: Article
Language: English
Published: BMC 2020-11-01
Series: BMC Cancer
Subjects: Machine learning, Classifier, Bloodstream infection, Children, Cancer
Online Access: http://link.springer.com/article/10.1186/s12885-020-07618-2
id doaj-ab4362dc870545cdb30712252d4def02
record_format Article
spelling doaj-ab4362dc870545cdb30712252d4def02 2020-11-25T04:06:00Z eng BMC BMC Cancer 1471-2407 2020-11-01 20119 10.1186/s12885-020-07618-2
Lillian Sung (0): Division of Haematology/Oncology, The Hospital for Sick Children
Conor Corbin (1): Biomedical Informatics Research, Stanford University
Ethan Steinberg (2): Biomedical Informatics Research, Stanford University
Emily Vettese (3): Division of Haematology/Oncology, The Hospital for Sick Children
Aaron Campigotto (4): Division of Infectious Diseases, The Hospital for Sick Children
Loreto Lecce (5): Division of Neonatology, The Hospital for Sick Children
George A. Tomlinson (6): Department of Medicine, University Health Network
Nigam Shah (7): Biomedical Informatics Research, Stanford University
collection DOAJ
language English
format Article
sources DOAJ
author Lillian Sung
Conor Corbin
Ethan Steinberg
Emily Vettese
Aaron Campigotto
Loreto Lecce
George A. Tomlinson
Nigam Shah
title Development and utility assessment of a machine learning bloodstream infection classifier in pediatric patients receiving cancer treatments
publisher BMC
series BMC Cancer
issn 1471-2407
publishDate 2020-11-01
description Abstract
Background: Objectives were to build a machine learning algorithm to identify bloodstream infection (BSI) among pediatric patients with cancer and hematopoietic stem cell transplantation (HSCT) recipients, and to compare this approach with presence of neutropenia to identify BSI.
Methods: We included patients 0–18 years of age at cancer diagnosis or HSCT between January 2009 and November 2018. Eligible blood cultures were those with no previous blood culture (regardless of result) within 7 days. The primary outcome was BSI. Four machine learning algorithms were used: elastic net, support vector machine and two implementations of gradient boosting machine (GBM and XGBoost). Model training and evaluation were performed using temporally disjoint training (60%), validation (20%) and test (20%) sets. The best model was compared to neutropenia alone in the test set.
Results: Of 11,183 eligible blood cultures, 624 (5.6%) were positive. The best model in the validation set was GBM, which achieved an area-under-the-receiver-operator-curve (AUROC) of 0.74 in the test set. Among the 2236 in the test set, the number of false positives and specificity of GBM vs. neutropenia were 508 vs. 592 and 0.76 vs. 0.72 respectively. Among 139 test set BSIs, six (4.3%) non-neutropenic patients were identified by GBM. All received antibiotics prior to culture result availability.
Conclusions: We developed a machine learning algorithm to classify BSI. GBM achieved an AUROC of 0.74 and identified 4.3% additional true cases in the test set. The machine learning algorithm did not perform substantially better than using presence of neutropenia alone to predict BSI.
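The figures reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the article; it assumes that each of the 2236 test-set blood cultures is one classification unit and that the 139 test-set BSIs are the positives among them, so the remaining cultures are the negatives. Variable and function names are illustrative only.

```python
# Counts quoted in the abstract.
total_cultures = 11_183
positive_cultures = 624
test_cultures = 2_236
test_bsi = 139
false_pos_gbm = 508
false_pos_neutropenia = 592
extra_cases_gbm = 6


def specificity(false_positives: int, negatives: int) -> float:
    """Specificity = true negatives / all negatives."""
    return (negatives - false_positives) / negatives


# Assumption: test-set negatives are all test cultures that were not BSIs.
test_negatives = test_cultures - test_bsi

prevalence = positive_cultures / total_cultures
spec_gbm = specificity(false_pos_gbm, test_negatives)
spec_neutropenia = specificity(false_pos_neutropenia, test_negatives)
extra_fraction = extra_cases_gbm / test_bsi

print(f"BSI prevalence:           {prevalence:.1%}")
print(f"Specificity, GBM:         {spec_gbm:.2f}")
print(f"Specificity, neutropenia: {spec_neutropenia:.2f}")
print(f"Additional cases via GBM: {extra_fraction:.1%}")
```

Under that assumption the rounded values agree with those in the abstract (5.6%, 0.76, 0.72 and 4.3%).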
topic Machine learning
Classifier
Bloodstream infection
Children
Cancer
url http://link.springer.com/article/10.1186/s12885-020-07618-2
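The abstract mentions temporally disjoint training (60%), validation (20%) and test (20%) sets. One way such a split could be produced is sketched below; this is an assumption for illustration, not the authors' procedure, which may instead have used calendar cut-off dates. The function `temporal_split` and the toy records are hypothetical.

```python
from datetime import date, timedelta
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def temporal_split(
    records: Sequence[T],
    key: Callable[[T], date],
    train_frac: float = 0.6,
    valid_frac: float = 0.2,
) -> Tuple[List[T], List[T], List[T]]:
    """Sort records by date, then cut into earliest / middle / latest blocks."""
    ordered = sorted(records, key=key)
    n = len(ordered)
    train_end = int(n * train_frac)
    valid_end = train_end + int(n * valid_frac)
    # Whatever remains after the first two blocks becomes the test set.
    return ordered[:train_end], ordered[train_end:valid_end], ordered[valid_end:]


# Toy data: ten hypothetical culture dates, deliberately out of order.
start = date(2009, 1, 1)
cultures = [start + timedelta(days=30 * i) for i in (3, 0, 7, 1, 9, 4, 2, 8, 6, 5)]

train, valid, test = temporal_split(cultures, key=lambda d: d)
print(len(train), len(valid), len(test))
```

Because the blocks are cut from a date-sorted list, every training record precedes every validation record, which in turn precedes every test record. This keeps later data from leaking into model fitting.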