Assistant diagnosis with Chinese electronic medical records based on CNN and BiLSTM with phrase-level and word-level attentions

Abstract. Background: Inferring the diseases related to a patient’s electronic medical records (EMRs) is of great significance for assisting doctors in diagnosis. Several recent prediction methods have shown that deep learning can capture the deep and complex information contained in EMRs. Howev...

Bibliographic Details
Main Authors: Tong Wang, Ping Xuan, Zonglin Liu, Tiangang Zhang
Format: Article
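Language: English
Published: BMC 2020-06-01
Series: BMC Bioinformatics
Subjects: EMR-related disease prediction; Convolutional neural network; Bidirectional long short-term memory; Attention at phrase level; Attention at word level
Online Access: http://link.springer.com/article/10.1186/s12859-020-03554-x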
id doaj-2fd8f9ef37c94a9f9e0383e52ed7e2b3
record_format Article
spelling doaj-2fd8f9ef37c94a9f9e0383e52ed7e2b3
2020-11-25T04:02:11Z
eng
BMC
BMC Bioinformatics
1471-2105
2020-06-01
volume 21, issue 1, pages 1-16
10.1186/s12859-020-03554-x
Assistant diagnosis with Chinese electronic medical records based on CNN and BiLSTM with phrase-level and word-level attentions
Tong Wang (School of Computer Science and Technology, Heilongjiang University)
Ping Xuan (School of Computer Science and Technology, Heilongjiang University)
Zonglin Liu (School of Computer Science and Technology, Heilongjiang University)
Tiangang Zhang (School of Mathematical Science, Heilongjiang University)
collection DOAJ
language English
format Article
sources DOAJ
author Tong Wang
Ping Xuan
Zonglin Liu
Tiangang Zhang
title Assistant diagnosis with Chinese electronic medical records based on CNN and BiLSTM with phrase-level and word-level attentions
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2020-06-01
description Abstract. Background: Inferring the diseases related to a patient’s electronic medical records (EMRs) is of great significance for assisting doctors in diagnosis. Several recent prediction methods have shown that deep learning can capture the deep and complex information contained in EMRs. However, they do not consider the discriminative contributions of different phrases and words. Moreover, the local information and context information of EMRs should be deeply integrated. Results: A new method, referred to as FCNBLA, is proposed for predicting the disease related to a given EMR; it is based on the fusion of a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM) with attention mechanisms. FCNBLA deeply integrates local information, context information of the word sequence, and the more informative phrases and words. A novel deep learning framework is developed to learn the local representation, the context representation, and the combination representation. The left side of the framework is built on a CNN to learn the local representation of adjacent words. The right side, built on a BiLSTM, learns the context representation of the word sequence. Not all phrases and words contribute equally to representing the meaning of an EMR. Therefore, attention mechanisms are established at the phrase level and the word level, and the middle module of the framework learns the combination representation of the enhanced phrases and words. FCNBLA achieved a macro-average F-score of 91.29% and an accuracy of 92.78%. Conclusion: The experimental results indicate that FCNBLA outperforms several state-of-the-art methods. The attention mechanisms and combination representations are also confirmed to improve FCNBLA’s prediction performance. The method is helpful for assisting doctors in diagnosing diseases in patients.
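The word-level attention mentioned in the description weights each word's hidden state before the states are pooled into one representation. The following is a minimal sketch only, not the authors' implementation: it assumes an additive scoring form, score_t = u · tanh(W h_t + b), which the abstract does not specify, and the names `attention_pool`, `W`, `b`, and `u` as well as the toy values are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores to weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, W, b, u):
    """Weight each hidden state by a relevance score and sum the results.

    hidden_states: list of T vectors (e.g. BiLSTM outputs), each of length d
    W: k x d matrix, b: length-k bias, u: length-k context vector
    W, b, and u are hypothetical parameters; in a real model they are trained.
    """
    scores = []
    for h in hidden_states:
        # projected = tanh(W h + b), computed row by row
        projected = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        # score = u . projected
        scores.append(sum(u_i * p_i for u_i, p_i in zip(u, projected)))
    weights = softmax(scores)
    d = len(hidden_states[0])
    # Attention-weighted sum of the hidden states, one output dimension at a time
    pooled = [
        sum(a * h[j] for a, h in zip(weights, hidden_states))
        for j in range(d)
    ]
    return pooled, weights

# Toy example: three "words" with 2-dimensional hidden states.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, for illustration only
b = [0.0, 0.0]
u = [1.0, 1.0]
pooled, weights = attention_pool(H, W, b, u)
```

In this toy setup the third state receives the largest weight because both of its components align with `u`. A phrase-level variant would apply the same pooling to CNN feature vectors instead of per-word states, again as an assumption about the general technique rather than a description of FCNBLA's exact layers.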
topic EMR-related disease prediction
Convolutional neural network
Bidirectional long short-term memory
Attention at phrase level
Attention at word level
url http://link.springer.com/article/10.1186/s12859-020-03554-x