WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure
Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 103 === Wikipedia is a multilingual, content-rich online encyclopedia. Built on the Web 2.0 concept, Wikipedia allows anyone to share and edit its content, which also makes it easy to vandalize. Wikipedians therefore pay long-term sustain...
Main Authors: | Shr-Tzung Bai, 白士宗 |
---|---|
Other Authors: | |
Format: | Others |
Language: | en_US |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/20086078542352181367 |
id |
ndltd-TW-103NTUS5392049 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-103NTUS5392049 2016-11-06T04:19:40Z http://ndltd.ncl.edu.tw/handle/20086078542352181367 WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure WikiSERM:基於連續事件風險衡量之維基百科版本破壞偵測 Shr-Tzung Bai 白士宗 Master's National Taiwan University of Science and Technology Department of Computer Science and Information Engineering 103 Wikipedia is a multilingual, content-rich online encyclopedia. Built on the Web 2.0 concept, Wikipedia allows anyone to share and edit its content, which also makes it easy to vandalize. Wikipedians therefore put long-term, sustained effort into maintaining the quality of Wikipedia content. Past research on Wikipedia vandalism detection focused on text semantics, statistical features, and machine learning. Current work focuses on language-independent feature analysis and on continuity-based content-context correlation analysis. WikiSERM extracts key items based on Wikipedia edit tags, which apply across the language editions of Wikipedia. Key items derived from edit tags are therefore language-independent, which allows WikiSERM to be applied to vandalism detection in different language editions of Wikipedia. WikiSERM takes the full version of the article as evidence to judge risk trends and to analyze the usage status of each key item in a Wikipedia article (e.g., kept in use or completely deleted). By analyzing the continuity of each key item's usage status, we obtain the risk status of each key item in each corresponding revision. WikiSERM records the risk results of previous revisions in a two-dimensional array for fast lookup, so it can handle incremental data and provide risk assessment results immediately. By analyzing the key item transactions in a Wikipedia revision (e.g., adding a high-risk key item, adding a low-risk key item, deleting a high-risk key item, deleting a low-risk key item), WikiSERM marks revisions that exceed a threshold as high-risk. Our approach can help Wikipedia administrators quickly find vandalized revisions and identify the high-risk key items in them. none 李漢銘 2015 degree thesis ; thesis 71 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 103 === Wikipedia is a multilingual, content-rich online encyclopedia. Built on the Web 2.0 concept, Wikipedia allows anyone to share and edit its content, which also makes it easy to vandalize. Wikipedians therefore put long-term, sustained effort into maintaining the quality of Wikipedia content. Past research on Wikipedia vandalism detection focused on text semantics, statistical features, and machine learning. Current work focuses on language-independent feature analysis and on continuity-based content-context correlation analysis. WikiSERM extracts key items based on Wikipedia edit tags, which apply across the language editions of Wikipedia. Key items derived from edit tags are therefore language-independent, which allows WikiSERM to be applied to vandalism detection in different language editions of Wikipedia. WikiSERM takes the full version of the article as evidence to judge risk trends and to analyze the usage status of each key item in a Wikipedia article (e.g., kept in use or completely deleted). By analyzing the continuity of each key item's usage status, we obtain the risk status of each key item in each corresponding revision. WikiSERM records the risk results of previous revisions in a two-dimensional array for fast lookup, so it can handle incremental data and provide risk assessment results immediately. By analyzing the key item transactions in a Wikipedia revision (e.g., adding a high-risk key item, adding a low-risk key item, deleting a high-risk key item, deleting a low-risk key item), WikiSERM marks revisions that exceed a threshold as high-risk. Our approach can help Wikipedia administrators quickly find vandalized revisions and identify the high-risk key items in them. |
author2 |
none |
author_facet |
none Shr-Tzung Bai 白士宗 |
author |
Shr-Tzung Bai 白士宗 |
author_sort |
Shr-Tzung Bai |
title |
WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/20086078542352181367 |
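The abstract describes the method only in outline, so the following is a minimal sketch of the general idea (per-item risk stored per revision, then a thresholded score over added and deleted key items), not the thesis's actual algorithm. Everything specific here is an assumption made for illustration: treating `[[wiki link]]` targets as "key items", the smoothed risk formula, the `prior` of 0.5 for unseen items, the `threshold` of 2.0, and the hypothetical names `key_items` and `RiskTracker`.

```python
import re

# Assumption: a "key item" is the target of a [[wiki link]].
# The thesis may define key items differently.
LINK = re.compile(r"\[\[([^\]|#]+)")


def key_items(wikitext):
    """Return the set of link targets found in one revision's wikitext."""
    return {m.strip() for m in LINK.findall(wikitext)}


class RiskTracker:
    """Processes revisions one at a time and keeps a revision-by-item risk table."""

    def __init__(self, prior=0.5, threshold=2.0):
        self.prior = prior          # assumed risk for an item never seen before
        self.threshold = threshold  # assumed cut-off for a high-risk revision
        self.first_seen = {}        # item -> index of the revision that introduced it
        self.present = {}           # item -> number of revisions containing it
        self.table = []             # table[r][item] = risk of item after revision r
        self.prev_items = set()

    def _risk(self, item, r):
        # Illustrative formula: items that survive many revisions get low risk.
        span = r - self.first_seen[item] + 1
        return 1.0 - self.present[item] / (span + 1)

    def add_revision(self, wikitext):
        """Score a new revision against stored risks, then update the table."""
        r = len(self.table)
        items = key_items(wikitext)
        before = self.table[-1] if self.table else {}
        added = items - self.prev_items
        deleted = self.prev_items - items
        # Adding a risky item or deleting a stable (low-risk) item raises the score.
        score = sum(before.get(i, self.prior) for i in added)
        score += sum(1.0 - before.get(i, self.prior) for i in deleted)
        for i in items:
            self.first_seen.setdefault(i, r)
            self.present[i] = self.present.get(i, 0) + 1
        self.table.append({i: self._risk(i, r) for i in self.first_seen})
        self.prev_items = items
        # The first revision has no predecessor, so it is never flagged.
        return score, (r > 0 and score > self.threshold)
```

Because each call to `add_revision` reads only the last row of the stored table, a new revision can be scored without reprocessing the article's history, which is the incremental property the abstract mentions. A real system would need a principled risk measure and threshold in place of the placeholder values used here.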