Cross-Language Encyclopedia Article Linking
Ph.D. === National Taiwan University (國立臺灣大學) === Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) === 103 === Online encyclopedias, like Wikipedia, are among the most widely used internet services around the world. Though Wikipedia has many language editions, their coverage is imbalanced when compared to the number of language users both online and offline. Furthermor...
Main Authors: | Yu-Chun Wang, 王昱鈞 |
---|---|
Other Authors: | Jieh Hsiang |
Format: | Others |
Language: | en_US |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/00630180009911971519 |
id |
ndltd-TW-103NTU05392099 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-103NTU05392099 2016-11-19T04:09:55Z http://ndltd.ncl.edu.tw/handle/00630180009911971519 Cross-Language Encyclopedia Article Linking 跨語言線上百科連結 Yu-Chun Wang 王昱鈞 Ph.D. National Taiwan University (國立臺灣大學) Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) 103 Jieh Hsiang 項潔 2015 thesis 114 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Ph.D. === National Taiwan University (國立臺灣大學) === Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) === 103 === Online encyclopedias, like Wikipedia, are among the most widely used internet services around the world. Though Wikipedia has many language editions, their coverage is imbalanced when compared to the number of language users both online and offline. Furthermore, large alternative online encyclopedias exist for some languages, such as the Chinese Baidu Baike. We could improve access to the knowledge in these various sources by integrating multiple online encyclopedias into large multilingual knowledge bases. The main task in such a project is creating links between articles in different encyclopedias in different languages. Most research to date has focused on linking articles across the language editions of Wikipedia, and little work has been done on linking encyclopedias on other platforms. In this thesis, we develop a method for cross-language encyclopedia article linking (CLEAL) between encyclopedias on different platforms: English Wikipedia and Chinese Baidu Baike. We use a bilingual topic model and translation features in an SVM model to link articles between these two encyclopedias. To evaluate our approach, we compile datasets from Baidu Baike articles and their corresponding English Wikipedia articles. The evaluation results show that our approach achieves an MRR of 0.8252, outperforming the baseline system by 0.1745 (+26.82%). Our method does not depend heavily on specific platform formats or linguistic characteristics, so it could be easily extended to generate cross-language article links among other online encyclopedias in other languages and on other platforms. |
author2 |
Jieh Hsiang |
author_facet |
Jieh Hsiang Yu-Chun Wang 王昱鈞 |
author |
Yu-Chun Wang 王昱鈞 |
author_sort |
Yu-Chun Wang |
title |
Cross-Language Encyclopedia Article Linking |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/00630180009911971519 |
_version_ |
1718395001926320128 |
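
The description above reports results as mean reciprocal rank (MRR) plus an absolute and a relative gain over a baseline. As a minimal sketch of how those figures relate, assuming the standard definition of MRR (the record does not state the thesis's exact evaluation protocol), the following Python computes MRR on toy data and re-derives the quoted relative gain. The function names and the toy article titles are hypothetical and are not taken from the thesis.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    ranked_candidates: Sequence[Sequence[Hashable]],
    gold_targets: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct target; a missing target contributes 0."""
    if len(ranked_candidates) != len(gold_targets):
        raise ValueError("one gold target is required per ranked list")
    if not gold_targets:
        raise ValueError("at least one query is required")
    total = 0.0
    for candidates, gold in zip(ranked_candidates, gold_targets):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == gold:
                total += 1.0 / rank
                break
    return total / len(gold_targets)


def relative_gain(system_score: float, absolute_gain: float) -> float:
    """Relative improvement over the baseline implied by score and absolute gain."""
    baseline = system_score - absolute_gain
    return absolute_gain / baseline


# Toy data (hypothetical titles, NOT from the thesis dataset).
rankings = [
    ["Taipei", "Taiwan", "Tainan"],  # gold at rank 1, contributes 1.0
    ["Python", "Java", "Go"],        # gold at rank 2, contributes 0.5
    ["Mars", "Venus", "Earth"],      # gold absent, contributes 0.0
]
gold = ["Taipei", "Java", "Jupiter"]
toy_mrr = mean_reciprocal_rank(rankings, gold)  # (1.0 + 0.5 + 0.0) / 3

# Figures quoted in the description: MRR 0.8252, absolute gain 0.1745.
gain = relative_gain(0.8252, 0.1745)  # 0.1745 / 0.6507
print(f"toy MRR = {toy_mrr:.4f}, relative gain = {gain:.2%}")
```

Under these assumptions the implied baseline MRR is 0.8252 - 0.1745 = 0.6507, and the relative gain is consistent with the +26.82% stated in the description.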