Study on structured method of Chinese MRI report of nasopharyngeal carcinoma

Abstract Background Image text is an important type of text data in the medical field, as it can assist clinicians in making a diagnosis. However, due to the diversity of languages, most descriptions in the image text are unstructured data. The same medical phenomenon may also be described in various ways, s...


Bibliographic Details
Main Authors: Xin Huang, Hui Chen, Jing-Dong Yan
Format: Article
Language: English
Published: BMC 2021-07-01
Series: BMC Medical Informatics and Decision Making
Subjects: Structured medical text; Named entity recognition; Knowledge network
Online Access: https://doi.org/10.1186/s12911-021-01547-1
id doaj-f5ee25f2039c455f9deb4441975c951b
record_format Article
spelling doaj-f5ee25f2039c455f9deb4441975c951b; 2021-08-01T11:32:09Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2021-07-01; 21; S2; 1; 10; 10.1186/s12911-021-01547-1; Study on structured method of Chinese MRI report of nasopharyngeal carcinoma; Xin Huang [0]; Hui Chen [1]; Jing-Dong Yan [2]; [0] School of Biomedical Engineering, Southern Medical University; [1] Shuguang Hospital, Shanghai University of Traditional Chinese Medicine; [2] Nanfang Hospital, Southern Medical University; https://doi.org/10.1186/s12911-021-01547-1; Structured medical text; Named entity recognition; Knowledge network
collection DOAJ
language English
format Article
sources DOAJ
author Xin Huang
Hui Chen
Jing-Dong Yan
title Study on structured method of Chinese MRI report of nasopharyngeal carcinoma
publisher BMC
series BMC Medical Informatics and Decision Making
issn 1472-6947
publishDate 2021-07-01
description Abstract Background: Image text is an important type of text data in the medical field, as it can assist clinicians in making a diagnosis. However, due to the diversity of languages, most descriptions in the image text are unstructured data. The same medical phenomenon may also be described in various ways, so it remains challenging to conduct text structure analysis. The aim of this research is to develop a feasible approach that can automatically convert nasopharyngeal cancer reports into structured text and build a knowledge network. Methods: In this work, we compare commonly used named entity recognition (NER) models, choose the optimal model as our triplet extraction model, and present a Chinese structuring algorithm. Finally, we visualize the results of the algorithm in the form of a knowledge network of nasopharyngeal cancer. Results: In NER, both the accuracy and the recall of the BERT-CRF model reached 99%. The structured extraction rate is 84.74%, and the accuracy is 89.39%. The architecture based on a recurrent neural network does not rely on medical dictionaries or word segmentation tools and can realize triplet recognition. Conclusions: The BERT-CRF model has high performance in NER, and the triplets can reflect the content of the image report. This work can provide technical support for the construction of a nasopharyngeal cancer database.
topic Structured medical text
Named entity recognition
Knowledge network
url https://doi.org/10.1186/s12911-021-01547-1