Unsupervised Discovery of Named Entities with Fine-Grained Category on the Web

Bibliographic Details
Main Authors: Cheng-Han Chiang, 江政韓
Other Authors: Jason S. Chang
Format: Others
Language: en_US
Online Access: http://ndltd.ncl.edu.tw/handle/59000480779565169837
Description
Summary: Master's === National Tsing Hua University === Institute of Information Systems and Applications === 93 === We introduce a method for finding named entities (NEs) on the Web that belong to the same category as a given set of seed NEs. In our approach, passages containing the seed NEs are retrieved from the Web and used to construct a linguistic model for discovering new NEs of the same category. The method involves generating a table of key terms with word classes from Web page summaries that contain the seed NEs, and learning surface patterns around the seed NEs in these passages. At runtime, we use the salient key terms and word classes in the model to find new Web summaries, filter out unlikely passages, and extract new NEs from the remaining passages using the surface patterns. We present a prototype system, Name Finder, which applies the proposed method to discover additional NEs for a given set of seed NEs. We evaluate Name Finder and compare it with Google Sets. The experimental results show that our system produces more NEs, with an average precision comparable to that of Google Sets. Our methodology supports automatic knowledge discovery and ontology extension.
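The summary describes the pipeline only at a high level, so the following is a minimal illustrative sketch and not the Name Finder implementation. It assumes that passages are plain strings already retrieved from the Web, that a "surface pattern" is the single word to the left and right of a seed NE, that key terms are simply the most frequent non-seed words (word classes are omitted), and that a candidate NE is a run of capitalized words. All function names, the stopword list, and the sample passages are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "is", "are", "as", "by", "to", "for"}
# Assumption: a candidate NE is one or more capitalized words.
NAME = r"([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)"


def tokenize(text):
    return re.findall(r"[A-Za-z0-9]+", text)


def key_terms(passages, seeds, top_n=5):
    """Most frequent non-seed, non-stopword terms in the seed passages."""
    seed_words = {w.lower() for s in seeds for w in tokenize(s)}
    counts = Counter(
        w.lower()
        for p in passages
        for w in tokenize(p)
        if w.lower() not in STOPWORDS and w.lower() not in seed_words
    )
    return {term for term, _ in counts.most_common(top_n)}


def learn_patterns(passages, seeds, window=1):
    """Collect (left context, right context) word tuples around each seed (window >= 1)."""
    patterns = Counter()
    for p in passages:
        for s in seeds:
            for m in re.finditer(r"\b" + re.escape(s) + r"\b", p):
                left = tuple(tokenize(p[:m.start()])[-window:])
                right = tuple(tokenize(p[m.end():])[:window])
                if left or right:
                    patterns[(left, right)] += 1
    return patterns


def pattern_regex(left, right):
    """Turn a context pair into a regex with a capture slot for the candidate NE."""
    parts = [re.escape(w) for w in left] + [NAME] + [re.escape(w) for w in right]
    return re.compile(r"\b" + r"\W+".join(parts) + r"\b")


def is_relevant(passage, terms):
    """Keep a passage only if it contains at least one key term."""
    return bool({w.lower() for w in tokenize(passage)} & terms)


def discover(seed_passages, new_passages, seeds, window=1):
    """Filter new passages by key terms, then extract candidate NEs with the patterns."""
    terms = key_terms(seed_passages, seeds)
    patterns = learn_patterns(seed_passages, seeds, window)
    found = Counter()
    for p in new_passages:
        if not is_relevant(p, terms):
            continue
        for left, right in patterns:
            for m in pattern_regex(left, right).finditer(p):
                if m.group(1) not in seeds:
                    found[m.group(1)] += 1
    return [name for name, _ in found.most_common()]


seeds = ["Mozart", "Bach"]
seed_passages = ["The composer Mozart wrote many operas.",
                 "The composer Bach wrote fugues."]
new_passages = ["The composer Haydn wrote over a hundred symphonies.",
                "Stock prices fell sharply."]
print(discover(seed_passages, new_passages, seeds))  # → ['Haydn']
```

In this toy run the only learned pattern is "composer NAME wrote", so the second passage is dropped by the key-term filter and "Haydn" is the single extracted candidate. Ranking candidates by how many patterns and passages support them is one plausible design choice; the summary does not say how Name Finder scores its output.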