Summary: | Master's === Yuan Ze University === Department of Computer Science and Engineering === 97 === We observe that the number of Web pages is unevenly distributed across languages.
For example, the ODP directory contains a large number of English Web pages, but only has a relatively small number of Chinese and Korean Web pages.
However, some other Web taxonomies contain many more Chinese and Korean Web pages than ODP does.
Therefore, we plan to use these abundant Web resources to enrich the content of the non-English ODP taxonomies.
Because the non-English ODP directories contain few Web pages,
we use the English ODP directory as an external hierarchical thesaurus to help construct them.
This external cross-lingual hierarchical thesaurus is employed in a hierarchical catalog integration scheme to construct non-English Web taxonomies.
Our experiments show that the cross-lingual hierarchical thesaurus improves construction performance.
|