Detecting Microblog Spam using User Behavior and Content Analysis

Bibliographic Details
Main Authors: Shih-liang Chang, 張世良
Other Authors: Shi-Jinn Horng
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/83217783551264850485
Description
Summary: Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 98 === In recent years the Internet has grown rapidly, and the microblog has emerged as a new form of blog. A microblog differs from a traditional blog in that its content is typically much smaller, in both actual size and aggregate file size. A microblog post can contain up to 140 characters and appears on the author's profile page. Because microblogs are an easy way to contact other people, spammers can use them to spread malicious links, sexual advertisements, and meaningless content that bothers users. This thesis proposes a method that combines content-based features and user-behavior features to identify whether a microblog is spam. The former are used to detect relationships among the posted contents, and the latter are used to detect the user's behavior. The experimental data were all collected from Twitter and cover 2,100 users. Experimental results show that the detection rate of the proposed microblog spammer detector reaches up to 90%.