Web Taxonomy Construction using a Cross-lingual Hierarchical Thesaurus

Bibliographic Details
Main Authors: Cheng-Yu Chen, 陳政瑜
Other Authors: Cheng-Zen Yang
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/49294977700724640062
Description
Summary: Master's === Yuan Ze University === Department of Computer Science and Engineering === 97 === We observe that the number of Web pages is unevenly distributed across languages. For example, the ODP (Open Directory Project) directory contains a large number of English Web pages but only a relatively small number of Chinese and Korean Web pages. However, some other Web taxonomies contain more Chinese and Korean Web pages than ODP does. We therefore plan to use these abundant Web resources to enrich the content of the non-English ODP taxonomies. Because the non-English ODP directories contain few Web pages, we use the English ODP directory as an external hierarchical thesaurus to help construct them. This external cross-lingual hierarchical thesaurus is employed in a hierarchical catalog integration scheme to construct non-English Web taxonomies. Our experiments show that the cross-lingual hierarchical thesaurus improves construction performance.