Hierarchical Catalog Integrate based on the Maximum Entropy Model

Bibliographic Details
Main Authors: Cheng-Tse Hung, 洪誠澤
Other Authors: Cheng-Zen Yang
Format: Others
Language: zh-TW
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/14842545761557971657
Description
Summary: Master's === Yuan Ze University === Department of Computer Science and Engineering === 95 === In many areas, information on the Web is organized in catalogs, and the need to integrate two catalogs arises in many applications. These catalogs usually contain many Web documents and have complicated hierarchical structures, so integrating two catalogs accurately has become an important research topic. Past studies of the catalog integration problem have focused mainly on flattened catalogs, and only a few papers discuss the integration of hierarchical catalogs. To the best of our knowledge, no research has examined how additional semantic information can improve hierarchical catalog integration. This thesis presents an enhancement based on the Maximum Entropy (ME) model that uses the hierarchical thesaurus information embedded in the catalogs together with additional semantic features expanded from an external corpus. Experimental results on real-world catalogs indicate that the proposed approach consistently improves integration performance.