Summary: | Master's === National Tsing Hua University === Institute of Information Systems and Applications === 93 === We introduce a method for finding named entities (NEs) on the Web that belong to the same category as a given set of seed NEs. In our approach, passages containing the seed NEs are retrieved from the Web and then used to construct a linguistic model for discovering new NEs of the same category.
The method involves generating a table of key terms with word classes from Web page summaries that contain the seed NEs, and learning surface patterns around the seed NEs in these passages. At runtime, we use the salient key terms and word classes in the model to retrieve new Web summaries, filter out unlikely passages, and extract new NEs from the remaining passages using the surface patterns.
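The surface-pattern step can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names `learn_patterns` and `extract_candidates`, the fixed context width, and the capitalized-token heuristic for candidate names are all assumptions made for illustration, and the key-term table and word-class filtering are omitted.

```python
import re
from collections import Counter


def learn_patterns(passages, seeds, width=2):
    """Collect (left, right) word contexts that surround seed NEs.

    Hypothetical helper: a pattern here is simply the `width` words
    before and after each seed occurrence.
    """
    patterns = Counter()
    for text in passages:
        for seed in seeds:
            for m in re.finditer(re.escape(seed), text):
                left = text[:m.start()].split()[-width:]
                right = text[m.end():].split()[:width]
                if left and right:
                    patterns[(" ".join(left), " ".join(right))] += 1
    return patterns


def extract_candidates(passages, patterns, seeds, max_words=3):
    """Apply learned contexts to new passages and score unseen names.

    Assumption: a candidate NE is a run of 1..max_words capitalized
    tokens found between a learned left and right context.
    """
    found = Counter()
    for (left, right), weight in patterns.items():
        regex = re.compile(
            re.escape(left)
            + r"\s+((?:[A-Z]\w*\s+){0,%d}[A-Z]\w*)\s+" % (max_words - 1)
            + re.escape(right)
        )
        for text in passages:
            for m in regex.finditer(text):
                name = m.group(1)
                if name not in seeds:
                    found[name] += weight
    return found


seeds = {"Paris", "London"}
seed_passages = [
    "He flew to Paris last week for work.",
    "She flew to London last week again.",
]
new_passages = [
    "They flew to New York last week.",
    "We flew to Paris last week.",
    "I flew to Tokyo last week too.",
]
patterns = learn_patterns(seed_passages, seeds)
candidates = extract_candidates(new_passages, patterns, seeds)
```

In this toy run both seed passages yield the same context pair, so that pattern gets a higher weight, and the names it extracts from the new passages inherit that weight as a crude confidence score. A fuller system would first use the key terms to retrieve and filter passages before applying the patterns.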
We present a prototype system, Name Finder, which applies the proposed method to discover additional NEs from a small given set of seed NEs. We evaluate Name Finder against Google Sets. The experimental results show that our system produces more NEs, with an average precision comparable to that of Google Sets. Our methodology supports automatic knowledge discovery and ontology extension.
|