Supervised and unsupervised language modelling in Chest X-Ray radiological reports.

Chest radiography (CXR) is the most commonly used imaging modality and deep neural network (DNN) algorithms have shown promise in effective triage of normal and abnormal radiograms. Typically, DNNs require large quantities of expertly labelled training exemplars, which in clinical contexts is a majo...

Bibliographic Details
Main Authors: Ignat Drozdov, Daniel Forbes, Benjamin Szubert, Mark Hall, Chris Carlin, David J Lowe
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0229963
id doaj-94350471f1ac47d08d8ad3389d25dbac
record_format Article
spelling doaj-94350471f1ac47d08d8ad3389d25dbac | 2021-03-03T21:35:18Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2020-01-01 | 15 | 3 | e0229963 | 10.1371/journal.pone.0229963 | Supervised and unsupervised language modelling in Chest X-Ray radiological reports. | Ignat Drozdov | Daniel Forbes | Benjamin Szubert | Mark Hall | Chris Carlin | David J Lowe | Chest radiography (CXR) is the most commonly used imaging modality and deep neural network (DNN) algorithms have shown promise in effective triage of normal and abnormal radiograms. Typically, DNNs require large quantities of expertly labelled training exemplars, which in clinical contexts is a major bottleneck to effective modelling, as both considerable clinical skill and time is required to produce high-quality ground truths. In this work we evaluate thirteen supervised classifiers using two large free-text corpora and demonstrate that bi-directional long short-term memory (BiLSTM) networks with attention mechanism effectively identify Normal, Abnormal, and Unclear CXR reports in internal (n = 965 manually-labelled reports, f1-score = 0.94) and external (n = 465 manually-labelled reports, f1-score = 0.90) testing sets using a relatively small number of expert-labelled training observations (n = 3,856 annotated reports). Furthermore, we introduce a general unsupervised approach that accurately distinguishes Normal and Abnormal CXR reports in a large unlabelled corpus. We anticipate that the results presented in this work can be used to automatically extract standardized clinical information from free-text CXR radiological reports, facilitating the training of clinical decision support systems for CXR triage. | https://doi.org/10.1371/journal.pone.0229963
collection DOAJ
language English
format Article
sources DOAJ
author Ignat Drozdov
Daniel Forbes
Benjamin Szubert
Mark Hall
Chris Carlin
David J Lowe
spellingShingle Ignat Drozdov
Daniel Forbes
Benjamin Szubert
Mark Hall
Chris Carlin
David J Lowe
Supervised and unsupervised language modelling in Chest X-Ray radiological reports.
PLoS ONE
author_facet Ignat Drozdov
Daniel Forbes
Benjamin Szubert
Mark Hall
Chris Carlin
David J Lowe
author_sort Ignat Drozdov
title Supervised and unsupervised language modelling in Chest X-Ray radiological reports.
title_short Supervised and unsupervised language modelling in Chest X-Ray radiological reports.
title_full Supervised and unsupervised language modelling in Chest X-Ray radiological reports.
title_fullStr Supervised and unsupervised language modelling in Chest X-Ray radiological reports.
title_full_unstemmed Supervised and unsupervised language modelling in Chest X-Ray radiological reports.
title_sort supervised and unsupervised language modelling in chest x-ray radiological reports.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2020-01-01
description Chest radiography (CXR) is the most commonly used imaging modality and deep neural network (DNN) algorithms have shown promise in effective triage of normal and abnormal radiograms. Typically, DNNs require large quantities of expertly labelled training exemplars, which in clinical contexts is a major bottleneck to effective modelling, as both considerable clinical skill and time is required to produce high-quality ground truths. In this work we evaluate thirteen supervised classifiers using two large free-text corpora and demonstrate that bi-directional long short-term memory (BiLSTM) networks with attention mechanism effectively identify Normal, Abnormal, and Unclear CXR reports in internal (n = 965 manually-labelled reports, f1-score = 0.94) and external (n = 465 manually-labelled reports, f1-score = 0.90) testing sets using a relatively small number of expert-labelled training observations (n = 3,856 annotated reports). Furthermore, we introduce a general unsupervised approach that accurately distinguishes Normal and Abnormal CXR reports in a large unlabelled corpus. We anticipate that the results presented in this work can be used to automatically extract standardized clinical information from free-text CXR radiological reports, facilitating the training of clinical decision support systems for CXR triage.
url https://doi.org/10.1371/journal.pone.0229963
work_keys_str_mv AT ignatdrozdov supervisedandunsupervisedlanguagemodellinginchestxrayradiologicalreports
AT danielforbes supervisedandunsupervisedlanguagemodellinginchestxrayradiologicalreports
AT benjaminszubert supervisedandunsupervisedlanguagemodellinginchestxrayradiologicalreports
AT markhall supervisedandunsupervisedlanguagemodellinginchestxrayradiologicalreports
AT chriscarlin supervisedandunsupervisedlanguagemodellinginchestxrayradiologicalreports
AT davidjlowe supervisedandunsupervisedlanguagemodellinginchestxrayradiologicalreports
_version_ 1714816157003808768