Study on structured method of Chinese MRI report of nasopharyngeal carcinoma
Abstract Background: Image text is an important type of text data in the medical field, as it can assist clinicians in making a diagnosis. However, due to the diversity of languages, most descriptions in the image text are unstructured data. The same medical phenomenon may also be described in various ways, s...
Main Authors: | Xin Huang, Hui Chen, Jing-Dong Yan |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2021-07-01 |
Series: | BMC Medical Informatics and Decision Making |
Subjects: | Structured medical text, Named entity recognition, Knowledge network |
Online Access: | https://doi.org/10.1186/s12911-021-01547-1 |
id |
doaj-f5ee25f2039c455f9deb4441975c951b |
---|---|
record_format |
Article |
spelling |
doaj-f5ee25f2039c455f9deb4441975c951b; 2021-08-01T11:32:09Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2021-07-01; 21; S2; 1; 10; 10.1186/s12911-021-01547-1; Study on structured method of Chinese MRI report of nasopharyngeal carcinoma; Xin Huang (0), Hui Chen (1), Jing-Dong Yan (2); School of Biomedical Engineering, Southern Medical University; Shuguang Hospital, Shanghai University of Traditional Chinese Medicine; Nanfang Hospital, Southern Medical University; Abstract Background: Image text is an important type of text data in the medical field, as it can assist clinicians in making a diagnosis. However, due to the diversity of languages, most descriptions in the image text are unstructured data. The same medical phenomenon may also be described in various ways, such that it remains challenging to conduct text structure analysis. The aim of this research is to develop a feasible approach that can automatically convert nasopharyngeal cancer reports into structured text and build a knowledge network. Methods: In this work, we compare commonly used named entity recognition (NER) models, choose the optimal model as our triplet extraction model, and present a Chinese structuring algorithm. Finally, we visualize the results of the algorithm in the form of a knowledge network of nasopharyngeal cancer. Results: In NER, both accuracy and recall of the BERT-CRF model reached 99%. The structured extraction rate is 84.74%, and the accuracy is 89.39%. The architecture based on recurrent neural network does not rely on medical dictionaries or word segmentation tools and can realize triplet recognition. Conclusions: The BERT-CRF model has high performance in NER, and the triplet can reflect the content of the image report. This work can provide technical support for the construction of a nasopharyngeal cancer database.; https://doi.org/10.1186/s12911-021-01547-1; Structured medical text; Named entity recognition; Knowledge network |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xin Huang Hui Chen Jing-Dong Yan |
spellingShingle |
Xin Huang Hui Chen Jing-Dong Yan Study on structured method of Chinese MRI report of nasopharyngeal carcinoma BMC Medical Informatics and Decision Making Structured medical text Named entity recognition Knowledge network |
author_facet |
Xin Huang Hui Chen Jing-Dong Yan |
author_sort |
Xin Huang |
title |
Study on structured method of Chinese MRI report of nasopharyngeal carcinoma |
title_short |
Study on structured method of Chinese MRI report of nasopharyngeal carcinoma |
title_full |
Study on structured method of Chinese MRI report of nasopharyngeal carcinoma |
title_fullStr |
Study on structured method of Chinese MRI report of nasopharyngeal carcinoma |
title_full_unstemmed |
Study on structured method of Chinese MRI report of nasopharyngeal carcinoma |
title_sort |
study on structured method of chinese mri report of nasopharyngeal carcinoma |
publisher |
BMC |
series |
BMC Medical Informatics and Decision Making |
issn |
1472-6947 |
publishDate |
2021-07-01 |
description |
Abstract Background: Image text is an important type of text data in the medical field, as it can assist clinicians in making a diagnosis. However, due to the diversity of languages, most descriptions in the image text are unstructured data. The same medical phenomenon may also be described in various ways, such that it remains challenging to conduct text structure analysis. The aim of this research is to develop a feasible approach that can automatically convert nasopharyngeal cancer reports into structured text and build a knowledge network. Methods: In this work, we compare commonly used named entity recognition (NER) models, choose the optimal model as our triplet extraction model, and present a Chinese structuring algorithm. Finally, we visualize the results of the algorithm in the form of a knowledge network of nasopharyngeal cancer. Results: In NER, both accuracy and recall of the BERT-CRF model reached 99%. The structured extraction rate is 84.74%, and the accuracy is 89.39%. The architecture based on recurrent neural network does not rely on medical dictionaries or word segmentation tools and can realize triplet recognition. Conclusions: The BERT-CRF model has high performance in NER, and the triplet can reflect the content of the image report. This work can provide technical support for the construction of a nasopharyngeal cancer database. |
topic |
Structured medical text Named entity recognition Knowledge network |
url |
https://doi.org/10.1186/s12911-021-01547-1 |
work_keys_str_mv |
AT xinhuang studyonstructuredmethodofchinesemrireportofnasopharyngealcarcinoma AT huichen studyonstructuredmethodofchinesemrireportofnasopharyngealcarcinoma AT jingdongyan studyonstructuredmethodofchinesemrireportofnasopharyngealcarcinoma |
_version_ |
1721245842663276544 |