Summary: | Master's === National Cheng Kung University === Department of Computer Science and Information Engineering (Master's and Doctoral Program) === 96 === In recent years the volume of biomedical literature has increased dramatically, and domain experts can no longer manage it or extract knowledge from it efficiently, so much useful information is lost. To build intelligent biomedical knowledge management systems, researchers have proposed many relation extraction methods over the past several decades. Before those methods can be applied, however, the system must recognize the named entities in the literature and map each entity to its corresponding concept. Because the literature is not always written rigorously, the mapping process faces problems such as term variation and term ambiguity, and a direct mapping method therefore produces incorrect associations between named entities and concepts. The purpose of this study is thus to identify, automatically and accurately, the relevant concepts mentioned in the literature.
In this study, an influence network weighting strategy is applied to weight the similarity score between an entity and a concept and to handle term variation. The proposed disambiguation strategy is used to increase the confidence of the concepts identified in the literature. Unlike previous studies, this study makes good use of the information in both the entity and the concept to increase the precision of the system and to make the identified entities more meaningful. Experimental results show that the system using the proposed strategies outperforms simple strategies and previously proposed methods in biomedical entity normalization. More broadly, by normalizing named entities, this study aims to support subsequent text mining research such as PPI and co-occurrence analysis.
|