Study on structured method of Chinese MRI report of nasopharyngeal carcinoma


Bibliographic Details
Main Authors: Xin Huang, Hui Chen, Jing-Dong Yan
Format: Article
Language: English
Published: BMC 2021-07-01
Series: BMC Medical Informatics and Decision Making
Online Access: https://doi.org/10.1186/s12911-021-01547-1
Description
Summary: Abstract. Background: Image text is an important type of text data in the medical field, as it can assist clinicians in making a diagnosis. However, because language is diverse, most descriptions in image text are unstructured, and the same medical phenomenon may be described in various ways, so text structure analysis remains challenging. The aim of this research is to develop a feasible approach that can automatically convert nasopharyngeal cancer reports into structured text and build a knowledge network. Methods: In this work, we compare commonly used named entity recognition (NER) models, choose the optimal model as our triplet extraction model, and present a Chinese structuring algorithm. Finally, we visualize the results of the algorithm in the form of a knowledge network of nasopharyngeal cancer. Results: In NER, both the accuracy and the recall of the BERT-CRF model reached 99%. The structured extraction rate is 84.74%, and the accuracy is 89.39%. The architecture based on a recurrent neural network does not rely on medical dictionaries or word segmentation tools and can realize triplet recognition. Conclusions: The BERT-CRF model has high performance in NER, and the triplets can reflect the content of the image report. This work can provide technical support for the construction of a nasopharyngeal cancer database.
ISSN: 1472-6947
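
The summary describes a pipeline in which character-level NER output is turned into triplets and then into a knowledge network. The following is only a minimal sketch of that general idea, not the authors' algorithm: the tag names (`SITE`, `FINDING`), the relation name `has_finding`, the pairing rule, and the example sentence are all assumptions made for illustration, and the tags are written by hand rather than predicted by a BERT-CRF model.

```python
from collections import defaultdict


def decode_bio(chars, tags):
    """Group characters into (label, text) spans using BIO tags."""
    spans, label, buf = [], None, []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any span that is still open.
            if buf:
                spans.append((label, "".join(buf)))
            label, buf = tag[2:], [ch]
        elif tag.startswith("I-") and buf and tag[2:] == label:
            # Continuation of the currently open entity.
            buf.append(ch)
        else:
            # "O" or an inconsistent I- tag closes any open span.
            if buf:
                spans.append((label, "".join(buf)))
            label, buf = None, []
    if buf:
        spans.append((label, "".join(buf)))
    return spans


def spans_to_triplets(spans):
    """Hypothetical rule: attach each FINDING to the most recent SITE."""
    triplets, site = [], None
    for label, text in spans:
        if label == "SITE":
            site = text
        elif label == "FINDING" and site is not None:
            triplets.append((site, "has_finding", text))
    return triplets


def build_network(triplets):
    """Adjacency map of the network: head -> set of (relation, tail)."""
    graph = defaultdict(set)
    for head, relation, tail in triplets:
        graph[head].add((relation, tail))
    return graph


# Hypothetical report fragment: "thickening of the left wall of the nasopharynx".
chars = list("鼻咽左侧壁增厚")
tags = ["B-SITE", "I-SITE", "I-SITE", "I-SITE", "I-SITE",
        "B-FINDING", "I-FINDING"]

spans = decode_bio(chars, tags)
triplets = spans_to_triplets(spans)
graph = build_network(triplets)
```

Working at the character level mirrors the summary's point that the approach does not depend on word segmentation tools; in a real system the hand-written `tags` list would be replaced by the NER model's predictions.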