Deep Learning and NLP For Knowledge Extraction from Laboratory Reports

Introduction: Due to the ever-growing volume and complexity of clinical data, it has become a tedious task to extract information from data for secondary uses such as decision support, quality assurance, and outcome analysis. Recently, there have been great advances in Natural Language Processing (N...

Bibliographic Details
Main Authors: Branson Chen, Elham Dolatabadi, Jeffrey Kwong, Mahmoud Azimaee
Format: Article
Language: English
Published: Swansea University 2020-12-01
Series: International Journal of Population Data Science
Online Access: https://ijpds.org/article/view/1637
id doaj-588e097f9b4f4f0f8b0e3d79477e5fcc
record_format Article
spelling doaj-588e097f9b4f4f0f8b0e3d79477e5fcc; 2021-02-10T16:41:44Z; eng; Swansea University; International Journal of Population Data Science; ISSN 2399-4908; 2020-12-01; vol. 5, no. 5; DOI 10.23889/ijpds.v5i5.1637; Deep Learning and NLP For Knowledge Extraction from Laboratory Reports; Branson Chen (ICES); Elham Dolatabadi (Vector Institute and University of Toronto); Jeffrey Kwong (ICES and University of Toronto); Mahmoud Azimaee (ICES); https://ijpds.org/article/view/1637
collection DOAJ
language English
format Article
sources DOAJ
author Branson Chen
Elham Dolatabadi
Jeffrey Kwong
Mahmoud Azimaee
title Deep Learning and NLP For Knowledge Extraction from Laboratory Reports
publisher Swansea University
series International Journal of Population Data Science
issn 2399-4908
publishDate 2020-12-01
description Introduction: The growing volume and complexity of clinical data make it tedious to extract information for secondary uses such as decision support, quality assurance, and outcome analysis. Recent advances in Natural Language Processing (NLP) automate knowledge extraction from clinical reports in order to save costs and improve efficiency.
Objectives/Approach: Our goal is to develop an NLP tool that automatically extracts and encodes clinical information from laboratory reports. This study describes the tool and evaluates it on the Ontario Laboratory Information System (OLIS), a provincial repository of laboratory tests and results. OLIS is an electronic system that covers >200 labs and stores patients’ current and past test results as patients move through different areas of the healthcare system. The tool is a modular system of pipelined components, including a Named Entity Recognition module that extracts mentions of viruses and tests, and an inference step that combines the extracted entities into a meaningful outcome.
Results: Initial analyses were conducted on a segment of OLIS related to laboratory tests for respiratory viruses. These data included over a million observations corresponding to ~100 Logical Observation Identifiers Names and Codes (LOINC) codes, with >40,000 unique strings. The clinical text was cleaned, tokenized, and parsed using an in-house text algorithm that was continually refined through manual review by clinical experts. The parsed data were then encoded as virus and test types to serve as ground truth. The NLP tool was built on these ground truth data and achieved an accuracy greater than 95%.
Conclusion/Implications: Approaches like these can be applied to many areas of health research that make use of clinical reports. Our methods, when optimized and validated, can be deployed into clinical systems to provide on-the-spot analysis of various laboratory reports.
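The pipeline shape described in the abstract (clean, tokenize, recognize virus and test mentions, then combine them into an outcome) can be illustrated with a minimal Python sketch. This is not the authors' tool: it swaps in a simple dictionary lookup for the Named Entity Recognition component, and the lexicons, the result vocabulary, and every function name below are hypothetical choices made only for illustration.

```python
import re

# Hypothetical lexicons (assumptions, not taken from the article or from OLIS).
VIRUS_TERMS = {"influenza a": "influenza_a", "flu a": "influenza_a",
               "respiratory syncytial virus": "rsv", "rsv": "rsv"}
TEST_TERMS = {"pcr": "pcr", "culture": "culture", "antigen": "antigen"}
RESULT_TERMS = {"not detected": "negative", "negative": "negative",
                "detected": "positive", "positive": "positive"}


def clean(text):
    """Lowercase and replace every run of non-alphanumerics with one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def tokenize(text):
    """Split cleaned text on whitespace."""
    return text.split()


def find_mentions(text, lexicon):
    """Dictionary-based stand-in for NER: longest phrases are matched first,
    and a shorter phrase cannot reuse characters already claimed."""
    taken = [False] * len(text)
    found = []
    for phrase in sorted(lexicon, key=len, reverse=True):
        for m in re.finditer(r"\b" + re.escape(phrase) + r"\b", text):
            if not any(taken[m.start():m.end()]):
                for i in range(m.start(), m.end()):
                    taken[i] = True
                found.append((m.start(), lexicon[phrase]))
    # Return labels in order of appearance in the text.
    return [label for _, label in sorted(found)]


def extract(report):
    """Combine the recognized entities into one structured outcome."""
    text = " ".join(tokenize(clean(report)))
    viruses = find_mentions(text, VIRUS_TERMS)
    tests = find_mentions(text, TEST_TERMS)
    results = find_mentions(text, RESULT_TERMS)
    return {
        "viruses": sorted(set(viruses)),
        "test": tests[0] if tests else None,
        "result": results[0] if results else None,
    }


if __name__ == "__main__":
    print(extract("Influenza A RNA: NOT DETECTED by PCR."))
```

Matching longer phrases first keeps "not detected" from also being counted as "detected". A learned NER model, as the abstract implies, would replace find_mentions while the surrounding steps keep the same shape.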
url https://ijpds.org/article/view/1637