Predicting the occurrence of surgical site infections using text mining and machine learning.

In this study we propose the use of text mining and machine learning methods to predict and detect surgical site infections (SSIs) using textual descriptions of surgeries and post-operative patient records, mined from the database of a high-complexity university hospital. SSIs are among the m...

Bibliographic Details
Main Authors: Daniel A da Silva, Carla S Ten Caten, Rodrigo P Dos Santos, Flavio S Fogliatto, Juliana Hsuan
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2019-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0226272
id doaj-01eac79dadfa4371bbaaa4ad2c3e4142
record_format Article
spelling doaj-01eac79dadfa4371bbaaa4ad2c3e4142; 2021-03-03T21:21:05Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2019-01-01; 14; 12; e0226272; 10.1371/journal.pone.0226272; Predicting the occurrence of surgical site infections using text mining and machine learning.; Daniel A da Silva; Carla S Ten Caten; Rodrigo P Dos Santos; Flavio S Fogliatto; Juliana Hsuan; https://doi.org/10.1371/journal.pone.0226272
collection DOAJ
language English
format Article
sources DOAJ
author Daniel A da Silva
Carla S Ten Caten
Rodrigo P Dos Santos
Flavio S Fogliatto
Juliana Hsuan
title Predicting the occurrence of surgical site infections using text mining and machine learning.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2019-01-01
description In this study we propose the use of text mining and machine learning methods to predict and detect surgical site infections (SSIs) using textual descriptions of surgeries and post-operative patient records, mined from the database of a high-complexity university hospital. SSIs are among the most common adverse events experienced by hospitalized patients; preventing such events is fundamental to ensuring patient safety. Knowledge of SSI occurrence rates may also be useful in preventing future episodes. We analyzed 15,479 surgery descriptions and post-operative records, testing different preprocessing strategies and the following machine learning algorithms: Linear SVC, Logistic Regression, Multinomial Naive Bayes, Nearest Centroid, Random Forest, Stochastic Gradient Descent, and Support Vector Classification (SVC). For prediction purposes, the best result was obtained using the Stochastic Gradient Descent method (79.7% ROC-AUC); for detection, Logistic Regression yielded the best performance (80.6% ROC-AUC).
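The description reports classifier performance as ROC-AUC. As a reading aid, the sketch below shows one common way to compute that metric: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical illustrations; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a randomly chosen positive case
    is scored above a randomly chosen negative case (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = infection recorded, 0 = no infection.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]  # made-up classifier scores
auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.3f}")  # prints "ROC-AUC = 0.778"
```

Read this way, a value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. The pairwise loop is quadratic in the number of records, so it is intended for illustration on small inputs only.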
url https://doi.org/10.1371/journal.pone.0226272