Design and Evaluation of Algorithms for Topic Hierarchy Integration


Bibliographic Details
Main Authors: Chi-Feng Chang, 張啟峰
Other Authors: Jyh-Jong Tsay
Format: Others
Language: zh-TW
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/02471981001283162993
Description
Summary: Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 90 === In this thesis, we study the problem of integrating documents from different sources into a comprehensive topic hierarchy. Our objective is to develop efficient techniques that improve the accuracy of traditional categorization methods by incorporating the categorization information provided by data sources into the categorization process. On the World Wide Web, such categorization information is often available from the information sources themselves: news from newspapers, books from publishers, items from e-commerce sites, and even web pages archived by web portals are already categorized. Moreover, many of the topic hierarchies adopted by current information sources are highly related. We believe that this categorization information can be used to improve classification accuracy. We present several techniques that explore the relations between topic hierarchies and incorporate categorization information from source hierarchies into traditional classification methods such as Bayesian methods and support vector machines. Experiments on collections from Openfind and Yam, and from Google and Yahoo, popular web sites in Taiwan and the USA, respectively, show that incorporating categorization information from source hierarchies can significantly improve classification accuracy.
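
The idea in the summary can be illustrated with a small sketch. The record does not describe the thesis's actual algorithms, so the code below is only one plausible reading, not the authors' method: a naive Bayes text classifier whose class prior is conditioned on the category a document carried in its source hierarchy. The function names, the `weight` parameter, and the toy documents are all assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (tokens, source category, target category).
TOY_DOCS = [
    (["stock", "market"], "Business", "Finance"),
    (["stock", "price"], "Business", "Finance"),
    (["soup", "stock"], "Food", "Cooking"),
    (["soup", "recipe"], "Food", "Cooking"),
]


def train(docs):
    """Collect word counts per target and target counts per source category."""
    word_counts = defaultdict(Counter)    # target -> word frequencies
    source_counts = defaultdict(Counter)  # source -> target frequencies
    target_counts = Counter()
    vocab = set()
    for tokens, source, target in docs:
        word_counts[target].update(tokens)
        source_counts[source][target] += 1
        target_counts[target] += 1
        vocab.update(tokens)
    return word_counts, source_counts, target_counts, vocab


def classify(tokens, source, model, weight=1.0):
    """Return the target category with the highest log score.

    weight scales the source-category hint (an assumed knob);
    weight=0 drops the hint and uses the text likelihood alone.
    """
    word_counts, source_counts, target_counts, vocab = model
    targets = list(target_counts)
    seen = source_counts.get(source, Counter())
    best, best_score = None, -math.inf
    for t in targets:
        # Laplace-smoothed estimate of P(target | source category).
        prior = (seen[t] + 1) / (sum(seen.values()) + len(targets))
        score = weight * math.log(prior)
        total = sum(word_counts[t].values())
        for w in tokens:
            # Laplace-smoothed estimate of P(word | target).
            score += math.log((word_counts[t][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = t, score
    return best
```

In this toy setting the ambiguous word "stock" is assigned to Finance when the source hint is ignored, but a document that arrives from the source category "Food" is assigned to Cooking once the hint is used, which is the kind of effect the summary attributes to source categorization information.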