Summary: | Master's === National Cheng Kung University === Institute of Medical Informatics === 101 === During the last decade, the advent of ontologies for biomedical annotation has had a deep impact on the life sciences. MeSH is a well-known ontology used to index journal articles in PubMed, which improves literature searching on multi-domain topics. With the explosive growth of data in recent years, new terms and concepts continually emerge and supersede older ones. Automatically extending sets of existing terms will enable bio-curators to systematically improve text-based ontologies level by level. However, most related techniques, which apply symbolic patterns to a literature corpus, tend to focus on the more general parts of the ontology rather than the specific ones. Therefore, in this work, we present a novel method that uses genealogical information from the ontology itself to find suitable siblings for ontology extension. Our approach comprises a sibling generation stage and a pruning strategy that operate along the breadth and depth dimensions of the ontology. On average, the genealogy-based method achieved a precision of 0.5, with the best result, 0.83, in the category “Organisms”. We also achieved an average precision of 0.69 on 229 new terms in the 2013 version of MeSH. Furthermore, we found an opportunity to extend an ontology across multiple domains by employing knowledge from ontologies of different domains.
|