WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure

Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 103 === Wikipedia is a multilingual, content-rich online encyclopedia. Following the Web 2.0 model, Wikipedia allows anyone to share and edit its content, which also makes it easy to vandalize. Therefore, all Wikipedians pay long-term sustain...


Bibliographic Details
Main Authors: Shr-Tzung Bai, 白士宗
Other Authors: none
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/20086078542352181367
id ndltd-TW-103NTUS5392049
record_format oai_dc
spelling ndltd-TW-103NTUS5392049 2016-11-06T04:19:40Z http://ndltd.ncl.edu.tw/handle/20086078542352181367 WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure WikiSERM:基於連續事件風險衡量之維基百科版本破壞偵測 Shr-Tzung Bai 白士宗 Master's thesis National Taiwan University of Science and Technology Department of Computer Science and Information Engineering 103 none 李漢銘 2015 degree thesis ; thesis 71 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 103 === Wikipedia is a multilingual, content-rich online encyclopedia. Following the Web 2.0 model, Wikipedia allows anyone to share and edit its content, which also makes it easy to vandalize. Wikipedians therefore put sustained, long-term effort into maintaining the quality of Wikipedia content. Past research on Wikipedia vandalism detection focused on text semantics, statistical features, and machine learning. Current research focuses on language-independent feature analysis and on continuity-based content-context correlation analysis. WikiSERM extracts key items based on Wikipedia edit tags, which are used across the various language editions of Wikipedia. Key items based on edit tags are therefore language-independent, which allows WikiSERM to be applied to vandalism detection in different language editions of Wikipedia. WikiSERM takes the full version of the article as evidence to judge risk trends and to analyze the usage status of each key item in a Wikipedia article (e.g., whether it keeps being used or has been completely deleted). By analyzing the continuity of each key item's usage status, we obtain the risk status of each key item in each corresponding revision. WikiSERM records the risk results of previous revisions in a two-dimensional array for fast lookup, so it can handle incremental data and provide risk assessment results immediately. By analyzing the key-item transactions in a Wikipedia revision (e.g., adding a high-risk key item, adding a low-risk key item, deleting a high-risk key item, deleting a low-risk key item), WikiSERM treats revisions that exceed a threshold as high-risk revisions. Our approach can help Wikipedia administrators quickly find vandalizing revisions and identify the high-risk key items in them.
author2 none
author_facet none
Shr-Tzung Bai
白士宗
author Shr-Tzung Bai
白士宗
spellingShingle Shr-Tzung Bai
白士宗
WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure
author_sort Shr-Tzung Bai
title WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure
title_short WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure
title_full WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure
title_fullStr WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure
title_full_unstemmed WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure
title_sort wikiserm: wikipedia vandalism detection through sequential event risk measure
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/20086078542352181367
work_keys_str_mv AT shrtzungbai wikisermwikipediavandalismdetectionthroughsequentialeventriskmeasure
AT báishìzōng wikisermwikipediavandalismdetectionthroughsequentialeventriskmeasure
AT shrtzungbai wikisermjīyúliánxùshìjiànfēngxiǎnhéngliàngzhīwéijībǎikēbǎnběnpòhuàizhēncè
AT báishìzōng wikisermjīyúliánxùshìjiànfēngxiǎnhéngliàngzhīwéijībǎikēbǎnběnpòhuàizhēncè
_version_ 1718391518127980544
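
Illustrative note: the abstract in this record describes tracking each key item's usage status across revisions, storing per-revision risk results for fast lookup, and flagging revisions whose key-item transactions exceed a threshold. The record does not give the thesis's actual formulas, so the Python sketch below is only one possible reading. It assumes that a key item is the target of an internal `[[link]]`, that risk is a smoothed ratio of removals to observations, and that the threshold is 2.0. The names `extract_key_items`, `RiskTracker`, `item_risk`, and `add_revision` are hypothetical and are not taken from the thesis.

```python
import re
from dataclasses import dataclass, field

# Assumption: a "key item" is the target of an internal [[link]] in wikitext.
# The thesis may define key items differently.
KEY_ITEM_RE = re.compile(r"\[\[([^\[\]|#]+)")


def extract_key_items(wikitext: str) -> set[str]:
    """Return the set of link targets found in one revision's wikitext."""
    return {match.strip() for match in KEY_ITEM_RE.findall(wikitext)}


@dataclass
class RiskTracker:
    """Incrementally scores revisions of a single article (hypothetical design)."""
    threshold: float = 2.0  # assumed cut-off, not a value from the thesis
    kept: dict[str, int] = field(default_factory=dict)     # revisions containing the item
    removed: dict[str, int] = field(default_factory=dict)  # times the item was deleted
    # One row per revision, mapping key item -> risk; a stand-in for the
    # two-dimensional array mentioned in the abstract.
    risk_table: list[dict[str, float]] = field(default_factory=list)
    previous: set[str] = field(default_factory=set)

    def item_risk(self, item: str) -> float:
        """Assumed heuristic: smoothed share of removals among all observations."""
        k = self.kept.get(item, 0)
        r = self.removed.get(item, 0)
        return (r + 1) / (k + r + 2)  # unseen item -> 0.5, long-lived item -> near 0

    def add_revision(self, wikitext: str) -> tuple[float, bool]:
        """Score one new revision and update the stored history."""
        current = extract_key_items(wikitext)
        added = current - self.previous
        deleted = self.previous - current
        # Adding a risky item or deleting a stable (low-risk) item raises the score.
        score = float(
            sum(self.item_risk(item) for item in added)
            + sum(1.0 - self.item_risk(item) for item in deleted)
        )
        # Update usage history only after scoring, so the score reflects prior evidence.
        for item in deleted:
            self.removed[item] = self.removed.get(item, 0) + 1
        for item in current:
            self.kept[item] = self.kept.get(item, 0) + 1
        self.risk_table.append({item: self.item_risk(item) for item in current})
        self.previous = current
        return score, score > self.threshold


if __name__ == "__main__":
    tracker = RiskTracker()
    for text in ["See [[Alpha]] and [[Beta]].", "See [[Alpha]] and [[Beta]].", ""]:
        print(tracker.add_revision(text))
```

Because each call to `add_revision` only compares the new revision with the stored state, earlier revisions never need to be re-read, which matches the incremental behaviour the abstract describes. The scoring rule itself is a placeholder and would need to be replaced by the measure defined in the thesis.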