Deep Learning and NLP For Knowledge Extraction from Laboratory Reports
Introduction Due to the ever-growing volume and complexity of clinical data, it has become a tedious task to extract information from data for secondary uses such as decision support, quality assurance, and outcome analysis. Recently, there have been great advances in Natural Language Processing (N...
Main Authors: | Branson Chen, Elham Dolatabadi, Jeffrey Kwong, Mahmoud Azimaee |
---|---|
Format: | Article |
Language: | English |
Published: | Swansea University, 2020-12-01 |
Series: | International Journal of Population Data Science |
Online Access: | https://ijpds.org/article/view/1637 |
id |
doaj-588e097f9b4f4f0f8b0e3d79477e5fcc |
---|---|
record_format |
Article |
spelling |
doaj-588e097f9b4f4f0f8b0e3d79477e5fcc; 2021-02-10T16:41:44Z; eng; Swansea University; International Journal of Population Data Science; ISSN 2399-4908; 2020-12-01; vol. 5, no. 5; doi:10.23889/ijpds.v5i5.1637; Deep Learning and NLP For Knowledge Extraction from Laboratory Reports; Branson Chen (ICES), Elham Dolatabadi (Vector Institute and University of Toronto), Jeffrey Kwong (ICES and University of Toronto), Mahmoud Azimaee (ICES); https://ijpds.org/article/view/1637 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Branson Chen Elham Dolatabadi Jeffrey Kwong Mahmoud Azimaee |
title |
Deep Learning and NLP For Knowledge Extraction from Laboratory Reports |
publisher |
Swansea University |
series |
International Journal of Population Data Science |
issn |
2399-4908 |
publishDate |
2020-12-01 |
description |
Introduction
Due to the ever-growing volume and complexity of clinical data, it has become a tedious task to extract information from data for secondary uses such as decision support, quality assurance, and outcome analysis. Recently, there have been great advances in Natural Language Processing (NLP) approaches that automate knowledge extraction from clinical reports in order to save costs and improve efficiency.
Objectives/Approach
Our goal is to develop an NLP tool that automatically extracts and encodes clinical information from laboratory reports. This study describes and evaluates the tool on a provincial repository of laboratory tests and results, the Ontario Laboratory Information System (OLIS). OLIS is an electronic system that covers >200 labs and stores patients’ current and past test results as patients move through different areas of the healthcare system. The tool is a modular system of pipelined components, including a Named Entity Recognition module that extracts virus and test mentions and an inference step that combines the extracted entities into a meaningful outcome.
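The two-stage design described above (entity recognition followed by inference) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the model, so a simple dictionary lookup stands in for the Named Entity Recognition module, and the lexicons, the negation cues, the positive/negative result field, and the names `recognize_entities` and `infer_outcome` are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicons; the real tool's vocabularies are not given in the abstract.
VIRUS_TERMS = {"influenza a": "FLU_A", "influenza b": "FLU_B", "rsv": "RSV"}
TEST_TERMS = {"pcr": "PCR", "culture": "CULTURE", "antigen": "ANTIGEN"}
# Hypothetical cues used by the inference step to decide the result.
NEGATIVE_CUES = ("not detected", "negative")


@dataclass(frozen=True)
class Entity:
    label: str  # "VIRUS" or "TEST"
    code: str   # normalized code for the mention
    start: int  # character offset of the mention in the text


def recognize_entities(text):
    """Stage 1: find virus and test mentions (dictionary stand-in for NER)."""
    lowered = text.lower()
    entities = []
    for lexicon, label in ((VIRUS_TERMS, "VIRUS"), (TEST_TERMS, "TEST")):
        for term, code in lexicon.items():
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, lowered):
                entities.append(Entity(label, code, match.start()))
    return sorted(entities, key=lambda e: e.start)


def infer_outcome(text, entities):
    """Stage 2: combine the extracted entities into one structured outcome."""
    lowered = text.lower()
    negative = any(cue in lowered for cue in NEGATIVE_CUES)
    return {
        "viruses": sorted({e.code for e in entities if e.label == "VIRUS"}),
        "tests": sorted({e.code for e in entities if e.label == "TEST"}),
        "result": "negative" if negative else "positive",
    }


report = "Influenza A RNA not detected by PCR"
print(infer_outcome(report, recognize_entities(report)))
```

Keeping the stages as separate functions mirrors the modular, pipelined layout the abstract describes: the lookup in stage 1 could be replaced by a trained model without changing stage 2.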
Results
Initial analyses were conducted on a segment of OLIS related to laboratory tests for respiratory viruses. These data included over a million observations corresponding to ~100 Logical Observation Identifiers Names and Codes (LOINC) codes, with >40,000 unique strings. The clinical text was cleaned, tokenized, and parsed using an in-house text algorithm that was continually refined through manual review by clinical experts. The parsed data were then encoded as virus and test types to serve as ground truth. The NLP tool was built on these ground-truth data and achieved an accuracy greater than 95%.
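As a rough illustration of the preprocessing and evaluation steps above, the sketch below cleans and tokenizes result strings, collapses observations into unique strings, and scores predictions against ground-truth labels. The in-house algorithm is not described in the abstract, so the cleaning rules and the function names `clean`, `tokenize`, `unique_strings`, and `accuracy` are assumptions chosen for illustration only.

```python
import re
from collections import Counter


def clean(text):
    """Lowercase and replace runs of non-alphanumeric characters with a space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def tokenize(text):
    """Split the cleaned text on whitespace."""
    return clean(text).split()


def unique_strings(observations):
    """Count how often each cleaned string occurs across observations."""
    return Counter(clean(obs) for obs in observations)


def accuracy(predicted, truth):
    """Fraction of predictions that exactly match the ground-truth labels."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("cannot compute accuracy on an empty label list")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


print(tokenize("  Influenza-A  (PCR): NOT detected. "))
print(unique_strings(["Flu A POS", "flu a pos!", "RSV neg"]))
print(accuracy(["FLU_A", "RSV", "FLU_B", "RSV"], ["FLU_A", "RSV", "FLU_B", "FLU_A"]))
```

Collapsing observations to unique cleaned strings before review is one plausible way to keep manual review tractable when a million observations reduce to tens of thousands of distinct strings.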
Conclusion/Implications
Approaches like these can be applied to many areas of health research that make use of clinical reports. Our methods, when optimized and validated, can be deployed into clinical systems to provide on-the-spot analysis of various laboratory reports.
|
url |
https://ijpds.org/article/view/1637 |