Cross-Language Encyclopedia Article Linking


Bibliographic Details
Main Author: Yu-Chun Wang (王昱鈞)
Other Authors: Jieh Hsiang
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/00630180009911971519
id ndltd-TW-103NTU05392099
record_format oai_dc
spelling ndltd-TW-103NTU05392099 2016-11-19T04:09:55Z http://ndltd.ncl.edu.tw/handle/00630180009911971519 Cross-Language Encyclopedia Article Linking 跨語言線上百科連結 Yu-Chun Wang 王昱鈞 Jieh Hsiang 項潔 2015 thesis 114 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 103 === Online encyclopedias such as Wikipedia are among the most widely used internet services in the world. Although Wikipedia has many language editions, their coverage is imbalanced relative to the number of users of each language, both online and offline. Furthermore, large alternative online encyclopedias exist for some languages, such as Baidu Baike for Chinese. Access to the knowledge in these sources could be improved by integrating multiple online encyclopedias into large multilingual knowledge bases. The main task in such a project is creating links between articles in different encyclopedias in different languages. Most research to date has focused on linking articles across the language editions of Wikipedia, and little work has been done on linking encyclopedias on other platforms. In this thesis, we develop a method for cross-language encyclopedia article linking (CLEAL) between encyclopedias on different platforms: English Wikipedia and Chinese Baidu Baike. We use an SVM model with bilingual topic model features and translation features to link articles between these two encyclopedias. To evaluate our approach, we compile datasets from Baidu Baike articles and their corresponding English Wikipedia articles. The evaluation results show that our approach achieves a mean reciprocal rank (MRR) of 0.8252, outperforming the baseline system by 0.1745 (+26.82%). Because our method does not depend heavily on specific platform formats or linguistic characteristics, it could be extended to generate cross-language article links among other online encyclopedias in other languages and on other platforms.
author2 Jieh Hsiang
author Yu-Chun Wang
王昱鈞
title Cross-Language Encyclopedia Article Linking
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/00630180009911971519
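
The description field reports results as mean reciprocal rank (MRR) together with an absolute and a relative gain over the baseline. The following is a minimal sketch of how MRR is commonly computed, assuming each query article has exactly one correct target and the system returns a ranked candidate list. The function names and the toy data are illustrative assumptions and are not taken from the thesis; only the figures 0.8252 and 0.1745 come from the record.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct item per query (0 if it is missing)."""
    if len(rankings) != len(gold):
        raise ValueError("rankings and gold must have the same length")
    if not rankings:
        raise ValueError("at least one query is required")
    total = 0.0
    for candidates, target in zip(rankings, gold):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == target:
                total += 1.0 / rank
                break
    return total / len(rankings)


def relative_gain(score: float, absolute_gain: float) -> float:
    """Relative improvement over the baseline implied by score - absolute_gain."""
    baseline = score - absolute_gain
    return absolute_gain / baseline


# Made-up toy data: three queries, each with a ranked list of candidate titles.
toy_rankings = [["Taipei", "Taiwan"], ["Moon", "Sun", "Earth"], ["Cat", "Dog"]]
toy_gold = ["Taipei", "Earth", "Bird"]
toy_mrr = mean_reciprocal_rank(toy_rankings, toy_gold)  # (1 + 1/3 + 0) / 3

# Figures quoted in the description: MRR 0.8252, absolute gain 0.1745.
reported_gain = relative_gain(0.8252, 0.1745)
print(f"toy MRR = {toy_mrr:.4f}")
print(f"relative gain = {reported_gain:.2%}")
```

Under these assumptions, the implied baseline MRR is 0.8252 - 0.1745 = 0.6507, and 0.1745 / 0.6507 rounds to the +26.82% quoted in the description.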