Supervised and unsupervised language modelling in Chest X-Ray radiological reports.
Chest radiography (CXR) is the most commonly used imaging modality and deep neural network (DNN) algorithms have shown promise in effective triage of normal and abnormal radiograms. Typically, DNNs require large quantities of expertly labelled training exemplars, which in clinical contexts is a majo...
Main Authors: | Ignat Drozdov, Daniel Forbes, Benjamin Szubert, Mark Hall, Chris Carlin, David J Lowe |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2020-01-01 |
Series: | PLoS ONE |
Online Access: | https://doi.org/10.1371/journal.pone.0229963 |
id |
doaj-94350471f1ac47d08d8ad3389d25dbac |
---|---|
record_format |
Article |
spelling |
doaj-94350471f1ac47d08d8ad3389d25dbac; 2021-03-03T21:35:18Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2020-01-01; 15; 3; e0229963; 10.1371/journal.pone.0229963; Supervised and unsupervised language modelling in Chest X-Ray radiological reports.; Ignat Drozdov, Daniel Forbes, Benjamin Szubert, Mark Hall, Chris Carlin, David J Lowe; https://doi.org/10.1371/journal.pone.0229963 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ignat Drozdov, Daniel Forbes, Benjamin Szubert, Mark Hall, Chris Carlin, David J Lowe |
author_sort |
Ignat Drozdov |
title |
Supervised and unsupervised language modelling in Chest X-Ray radiological reports. |
title_sort |
supervised and unsupervised language modelling in chest x-ray radiological reports. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2020-01-01 |
description |
Chest radiography (CXR) is the most commonly used imaging modality and deep neural network (DNN) algorithms have shown promise in effective triage of normal and abnormal radiograms. Typically, DNNs require large quantities of expertly labelled training exemplars, which in clinical contexts is a major bottleneck to effective modelling, as both considerable clinical skill and time are required to produce high-quality ground truths. In this work we evaluate thirteen supervised classifiers using two large free-text corpora and demonstrate that bi-directional long short-term memory (BiLSTM) networks with an attention mechanism effectively identify Normal, Abnormal, and Unclear CXR reports in internal (n = 965 manually-labelled reports, f1-score = 0.94) and external (n = 465 manually-labelled reports, f1-score = 0.90) testing sets using a relatively small number of expert-labelled training observations (n = 3,856 annotated reports). Furthermore, we introduce a general unsupervised approach that accurately distinguishes Normal and Abnormal CXR reports in a large unlabelled corpus. We anticipate that the results presented in this work can be used to automatically extract standardized clinical information from free-text CXR radiological reports, facilitating the training of clinical decision support systems for CXR triage. |
url |
https://doi.org/10.1371/journal.pone.0229963 |
_version_ |
1714816157003808768 |
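
The description above reports f1-scores for a three-class labelling task (Normal, Abnormal, Unclear). As a rough illustration of how such a score can be computed from gold and predicted labels, here is a minimal Python sketch. The record does not state which averaging scheme the authors used, so the macro average shown here is an assumption, and the function names `f1_per_class` and `macro_f1` are hypothetical helpers, not taken from the paper.

```python
from collections import Counter

# Class names taken from the abstract; the averaging scheme below is an assumption.
LABELS = ("Normal", "Abnormal", "Unclear")


def f1_per_class(y_true, y_pred, labels=LABELS):
    """Return a dict mapping each label to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    gold = ["Normal", "Normal", "Abnormal", "Unclear"]
    predicted = ["Normal", "Abnormal", "Abnormal", "Unclear"]
    print(f1_per_class(gold, predicted))
    print(round(macro_f1(gold, predicted), 4))
```

A weighted or micro average would give different numbers on imbalanced test sets, so any comparison with the reported 0.94 and 0.90 would need the averaging scheme from the full article.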