Application of Chinese Semantic in Spam Mail Filtering


Bibliographic Details
Main Authors: Mei-Hua Chen, 陳美華
Other Authors: Ching-Hsiang Chen
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/69303562326325799833
Description
Summary: Master's === Tamkang University === Master's Program, Department of Statistics === 99 === Collective efforts across many sectors have produced considerable progress against spam mail, yet although protections keep improving, challenges remain. This study focuses on how much information is added to the model; to that end, we aim to explain the output with an improved set of input elements. We use 14 features of sender behavior and 20 keywords that TF-IDF identified as the most effective. In addition, we propose 24 new semantic-component variables that simulate writers' habits and capture the differences in expression between spam e-mail senders and legitimate e-mail senders. The results show that using all variables together gives the best classification results, whether with C4.5, MLP, or PNN.
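The summary mentions choosing keywords by TF-IDF without giving the exact formula or ranking rule. As a rough illustration only, the sketch below ranks terms by their TF-IDF score summed over a corpus and keeps the top k. The function name `top_keywords_by_tfidf`, the summed-score ranking, the unsmoothed `log(N / df)` weighting, and the toy corpus are all assumptions made for this example, not details taken from the thesis. For Chinese mail, the token lists would first have to come from a word-segmentation step, which is not shown here.

```python
import math
from collections import Counter


def top_keywords_by_tfidf(documents, k):
    """Return the k terms with the highest TF-IDF score summed over all documents.

    documents: list of token lists (assumed to be already segmented into words).
    The ranking rule is an assumption for illustration, not the thesis's method.
    """
    n_docs = len(documents)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in documents:
        if not tokens:
            continue  # skip empty messages to avoid dividing by zero
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)                   # relative term frequency
            idf = math.log(n_docs / doc_freq[term])    # unsmoothed inverse document frequency
            scores[term] += tf * idf

    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Toy, made-up corpus (hypothetical tokens, not the thesis data).
toy_corpus = [
    ["free", "prize", "click", "free"],
    ["meeting", "agenda", "click"],
    ["free", "offer", "prize"],
    ["meeting", "notes", "agenda"],
]
print(top_keywords_by_tfidf(toy_corpus, 3))
```

Under this sketch, calling the function with k = 20 on a segmented mail corpus would yield a keyword list of the size the summary describes; each keyword's presence or count in a message could then serve as one classifier input alongside the other variables.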