An Effective Identification Technology for Online News Comment Spammers in Internet Media
The development of the mobile Internet is changing the way we communicate with others. Internet media, including online news and social networks, have gradually become the main mobile crowdsourcing applications for information dissemination and user communication. However, the potential business...
Main Authors: | Huayou Si, Wen Sun, Jilin Zhang, Jian Wan, Neal N. Xiong, Li Zhou, Yongjian Ren |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
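Subjects: | Spammer identification, Internet media, online news comment, label propagation algorithm, mobile crowdsourcing applications |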
Online Access: | https://ieeexplore.ieee.org/document/8648390/ |
id |
doaj-12fabd1476f0439290c533e9939aa9e8 |
---|---|
record_format |
Article |
spelling |
doaj-12fabd1476f0439290c533e9939aa9e8; 2021-04-05T17:00:19Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 37792-37806; DOI 10.1109/ACCESS.2019.2900474; document 8648390; An Effective Identification Technology for Online News Comment Spammers in Internet Media; Huayou Si (https://orcid.org/0000-0002-8022-923X), Wen Sun, Jilin Zhang, Jian Wan, Li Zhou, Yongjian Ren: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China; Neal N. Xiong (https://orcid.org/0000-0002-0394-4635): College of Intelligence and Computing, Tianjin University, Tianjin, China |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Huayou Si, Wen Sun, Jilin Zhang, Jian Wan, Neal N. Xiong, Li Zhou, Yongjian Ren |
author_sort |
Huayou Si |
title |
An Effective Identification Technology for Online News Comment Spammers in Internet Media |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
The development of the mobile Internet is changing the way we communicate with others. Internet media, including online news and social networks, have gradually become the main mobile crowdsourcing applications for information dissemination and user communication. However, the potential business opportunities have stimulated the emergence of a large number of spammers, who post false statements, advertisements, pornographic content, and links to phishing websites on these media for commercial gain, which seriously degrades the experience of normal users. To reduce the harm of false information, spammer identification has been researched extensively. However, traditional spammer identification technologies involve high data costs and perform poorly, and most of them focus on social networks, with less research on online news. In this paper, we propose an effective technology for identifying online news comment spammers based on the label propagation algorithm (LPA), making full use of user comment behaviors and contents. First, we collect a large amount of news and comments from NetEase News and manually label some users in the data as spammers or normal users to construct a labeled dataset. Then, a set of behavioral and semantic features is extracted and quantified from the user comment behaviors and comment contents by statistical analysis. Next, we propose the identification technology based on the LPA. Finally, the feature values are input into the proposed technology in different combinations, and experiments and evaluations are carried out to determine the most effective combination of features and improve the technology. The results show that the proposed technology involves a lower data cost but achieves a better identification effect than some traditional technologies based on supervised classifiers. |
topic |
Spammer identification, Internet media, online news comment, label propagation algorithm, mobile crowdsourcing applications |
url |
https://ieeexplore.ieee.org/document/8648390/ |