Assistant diagnosis with Chinese electronic medical records based on CNN and BiLSTM with phrase-level and word-level attentions
Abstract. Background: Inferring diseases related to a patient’s electronic medical records (EMRs) is of great significance for assisting doctors in diagnosis. Several recent prediction methods have shown that deep learning-based methods can learn the deep and complex information contained in EMRs. Howev...
Main Authors: | Tong Wang, Ping Xuan, Zonglin Liu, Tiangang Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2020-06-01 |
Series: | BMC Bioinformatics |
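Subjects: | EMR-related disease prediction; Convolutional neural network; Bidirectional long short-term memory; Attention at phrase level; Attention at word level |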
Online Access: | http://link.springer.com/article/10.1186/s12859-020-03554-x |
id |
doaj-2fd8f9ef37c94a9f9e0383e52ed7e2b3 |
---|---|
record_format |
Article |
spelling |
doaj-2fd8f9ef37c94a9f9e0383e52ed7e2b3; 2020-11-25T04:02:11Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2020-06-01; vol. 21, no. 1, pp. 1-16; doi:10.1186/s12859-020-03554-x; Assistant diagnosis with Chinese electronic medical records based on CNN and BiLSTM with phrase-level and word-level attentions; Tong Wang (School of Computer Science and Technology, Heilongjiang University); Ping Xuan (School of Computer Science and Technology, Heilongjiang University); Zonglin Liu (School of Computer Science and Technology, Heilongjiang University); Tiangang Zhang (School of Mathematical Science, Heilongjiang University); http://link.springer.com/article/10.1186/s12859-020-03554-x; EMR-related disease prediction; Convolutional neural network; Bidirectional long short-term memory; Attention at phrase level; Attention at word level |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Tong Wang, Ping Xuan, Zonglin Liu, Tiangang Zhang |
title |
Assistant diagnosis with Chinese electronic medical records based on CNN and BiLSTM with phrase-level and word-level attentions |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2020-06-01 |
description |
Abstract. Background: Inferring diseases related to a patient’s electronic medical records (EMRs) is of great significance for assisting doctors in diagnosis. Several recent prediction methods have shown that deep learning-based methods can learn the deep and complex information contained in EMRs. However, they do not consider the discriminative contributions of different phrases and words. Moreover, the local information and context information of EMRs should be deeply integrated. Results: A new method, referred to as FCNBLA, is proposed for predicting a disease related to a given EMR. It is based on the fusion of a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM) with attention mechanisms. FCNBLA deeply integrates local information, context information of the word sequence, and the more informative phrases and words. A novel deep learning framework is developed to learn the local representation, the context representation and the combination representation. The left side of the framework is built on a CNN to learn the local representation of adjacent words. The right side of the framework is built on a BiLSTM and focuses on learning the context representation of the word sequence. Not all phrases and words contribute equally to the representation of an EMR’s meaning. Therefore, attention mechanisms are established at the phrase level and the word level, and the middle module of the framework learns the combination representation of the enhanced phrases and words. FCNBLA achieved a macro-average F-score of 91.29% and an accuracy of 92.78%. Conclusion: The experimental results indicate that FCNBLA yields superior performance compared with several state-of-the-art methods. The attention mechanisms and combination representations are also confirmed to improve FCNBLA’s prediction performance. The method is helpful for assisting doctors in diagnosing diseases in patients. |
topic |
EMR-related disease prediction; Convolutional neural network; Bidirectional long short-term memory; Attention at phrase level; Attention at word level |
url |
http://link.springer.com/article/10.1186/s12859-020-03554-x |
_version_ |
1724444043239227392 |
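
The abstract describes attention at the phrase level and the word level, with a middle module that combines the enhanced representations. This record does not give the paper's formulas, so the following is only a minimal sketch of generic attention pooling, assuming dot-product scoring against a query vector followed by a softmax. The names `attention_pool`, `word_states`, `phrase_feats`, `word_query` and `phrase_query` are hypothetical, and the toy numbers stand in for BiLSTM hidden states and CNN feature maps that a trained model would supply.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weight each vector by softmax(dot(vector, query)) and sum them.

    Assumption: dot-product scoring; the paper may use another scoring function.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


# Toy stand-ins (hypothetical values): one vector per word / per phrase.
word_states = [[0.1, 0.0], [0.9, 0.2], [0.0, 0.1]]   # e.g. BiLSTM outputs
phrase_feats = [[0.5, 0.5], [0.2, 0.8]]              # e.g. CNN feature maps
word_query = [2.0, 0.0]     # would be learned in a real model
phrase_query = [0.0, 2.0]   # would be learned in a real model

word_repr, word_w = attention_pool(word_states, word_query)
phrase_repr, phrase_w = attention_pool(phrase_feats, phrase_query)

# Combination representation, sketched here as a plain concatenation.
combined = word_repr + phrase_repr
print(word_w, phrase_w, combined)
```

In this sketch the second word and the second phrase receive the largest weights because they align best with their query vectors, which illustrates the idea that not all words and phrases contribute equally to the representation of an EMR.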