Extracting Symptoms from Narrative Text using Artificial Intelligence

Indiana University-Purdue University Indianapolis (IUPUI) === Electronic health records collect an enormous amount of data about patients. However, the information about the patient’s illness is stored in progress notes that are in an unstructured format. It is difficult for humans to annotate sym...


Bibliographic Details
Main Author:	Gandhi, Priyanka
Other Authors:	Zou, Xukai
Language:	en_US
Published:	2021
Subjects:	Artificial Intelligence Neural Network Machine Learning Medical Dataset
Online Access:	http://hdl.handle.net/1805/24759

id	ndltd-IUPUI-oai-scholarworks.iupui.edu-1805-24759
record_format	oai_dc
spelling	ndltd-IUPUI-oai-scholarworks.iupui.edu-1805-24759 2021-01-28T05:08:16Z Extracting Symptoms from Narrative Text using Artificial Intelligence Gandhi, Priyanka Zou, Xukai Luo, Xiao Xia, Yuni Artificial Intelligence Neural Network Machine Learning Medical Dataset Indiana University-Purdue University Indianapolis (IUPUI) Electronic health records collect an enormous amount of data about patients. However, the information about the patient’s illness is stored in progress notes that are in an unstructured format. It is difficult for humans to annotate symptoms listed in the free text. Recently, researchers have explored how advances in deep learning can be applied to process biomedical data. The information in the text can be extracted with the help of natural language processing. The research presented in this thesis aims at automating the process of symptom extraction. The proposed methods use pre-trained word embeddings such as BioWord2Vec, BERT, and BioBERT to generate vectors of the words based on the semantics and syntactic structure of sentences. BioWord2Vec embeddings are fed into a BiLSTM neural network with a CRF layer to capture the dependencies between the co-related terms in the sentence. The pre-trained BERT and BioBERT embeddings are fed into the BERT model with a CRF layer to analyze the output tags of neighboring tokens. The research shows that with the help of the CRF layer in neural network models, longer phrases of symptoms can be extracted from the text. The proposed models are compared with the UMLS MetaMap tool, which uses various sources to categorize the terms in the text into different semantic types, and Stanford CoreNLP, a dependency parser that analyses syntactic relations in the sentence to extract information. The performance of the models is analyzed by using strict, relaxed, and n-gram evaluation schemes. The results show BioBERT with a CRF layer can extract the majority of the human-labeled symptoms. Furthermore, the model is used to extract symptoms from COVID-19 tweets. The model was able to extract symptoms listed by the CDC as well as new symptoms. 2021-01-05T18:36:06Z 2021-01-05T18:36:06Z 2020-12 Thesis http://hdl.handle.net/1805/24759 en_US
collection	NDLTD
language	en_US
sources	NDLTD
topic	Artificial Intelligence Neural Network Machine Learning Medical Dataset
spellingShingle	Artificial Intelligence Neural Network Machine Learning Medical Dataset Gandhi, Priyanka Extracting Symptoms from Narrative Text using Artificial Intelligence
description	Indiana University-Purdue University Indianapolis (IUPUI) === Electronic health records collect an enormous amount of data about patients. However, the information about the patient’s illness is stored in progress notes that are in an unstructured format. It is difficult for humans to annotate symptoms listed in the free text. Recently, researchers have explored how advances in deep learning can be applied to process biomedical data. The information in the text can be extracted with the help of natural language processing. The research presented in this thesis aims at automating the process of symptom extraction. The proposed methods use pre-trained word embeddings such as BioWord2Vec, BERT, and BioBERT to generate vectors of the words based on the semantics and syntactic structure of sentences. BioWord2Vec embeddings are fed into a BiLSTM neural network with a CRF layer to capture the dependencies between the co-related terms in the sentence. The pre-trained BERT and BioBERT embeddings are fed into the BERT model with a CRF layer to analyze the output tags of neighboring tokens. The research shows that with the help of the CRF layer in neural network models, longer phrases of symptoms can be extracted from the text. The proposed models are compared with the UMLS MetaMap tool, which uses various sources to categorize the terms in the text into different semantic types, and Stanford CoreNLP, a dependency parser that analyses syntactic relations in the sentence to extract information. The performance of the models is analyzed by using strict, relaxed, and n-gram evaluation schemes. The results show BioBERT with a CRF layer can extract the majority of the human-labeled symptoms. Furthermore, the model is used to extract symptoms from COVID-19 tweets. The model was able to extract symptoms listed by the CDC as well as new symptoms.
author2	Zou, Xukai
author_facet	Zou, Xukai Gandhi, Priyanka
author	Gandhi, Priyanka
author_sort	Gandhi, Priyanka
title	Extracting Symptoms from Narrative Text using Artificial Intelligence
title_short	Extracting Symptoms from Narrative Text using Artificial Intelligence
title_full	Extracting Symptoms from Narrative Text using Artificial Intelligence
title_fullStr	Extracting Symptoms from Narrative Text using Artificial Intelligence
title_full_unstemmed	Extracting Symptoms from Narrative Text using Artificial Intelligence
title_sort	extracting symptoms from narrative text using artificial intelligence
publishDate	2021
url	http://hdl.handle.net/1805/24759
work_keys_str_mv	AT gandhipriyanka extractingsymptomsfromnarrativetextusingartificialintelligence
_version_	1719374470389432320


Similar Items