A Research for The Two-Tier Filtering Scheme of Anti-Spam

Master's === Ming Chuan University === Master's Program, Department of Information and Telecommunications Engineering === 94 === Support Vector Machine (SVM) and Naïve Bayes are well-known machine-learning algorithms for content-based spam filtering. On the basis of fast classification through the hyper-plane of SVM and flexible threshold setting of Bayes, in this th...

Bibliographic Details
Main Authors: Hsien-Chun Chang, 張僩鈞
Other Authors: Sheng-Cheng Yeh
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/2624zh
id ndltd-TW-094MCU05676002
record_format oai_dc
spelling ndltd-TW-094MCU05676002 2018-04-10T17:13:14Z http://ndltd.ncl.edu.tw/handle/2624zh A Research for The Two-Tier Filtering Scheme of Anti-Spam 兩階層式垃圾郵件過濾機制之研究 Hsien-Chun Chang 張僩鈞 Master's Ming Chuan University Master's Program, Department of Information and Telecommunications Engineering 94 Sheng-Cheng Yeh 葉生正 2006 thesis 66 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Ming Chuan University === Master's Program, Department of Information and Telecommunications Engineering === 94 === Support Vector Machine (SVM) and Naïve Bayes are well-known machine-learning algorithms for content-based spam filtering. Building on the fast classification offered by the SVM hyperplane and the flexible threshold setting of Bayes, this thesis proposes a two-tier anti-spam filtering scheme that combines SVM with a new Naïve Bayes model. In the first tier, Information Gain is used to select the keywords that form the SVM training vectors. The thesis also defines four margins around the hyperplane and passes the test data that fall within the margin to the second tier, where a Bayesian probability calculation decides the category of each sample. The results indicate that every margin setting improves accuracy (Accur) by about 1% to 4%, most notably the Maximum Distance and Average Distance margins. Additionally, the optimal model achieves a total accuracy above 97% on Chinese and English data. These results verify the usability and contribution of the proposed two-tier filtering scheme and new Naïve Bayes model.
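
The two-tier idea in the description can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy corpus, the hand-set linear weights standing in for a trained SVM, the single `margin` constant (the thesis defines four margins that the abstract does not specify), and all function names are assumptions made for illustration. The Naïve Bayes tier here is a plain Laplace-smoothed model that scores only the terms present in a message, not the "new" model the thesis proposes.

```python
import math


def entropy(pos, neg):
    """Entropy in bits of a two-class split given the class counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(docs, labels, term):
    """Information Gain of a term's presence/absence w.r.t. the spam label."""
    n = len(docs)
    spam = sum(labels)
    with_t = [y for d, y in zip(docs, labels) if term in d]
    without_t = [y for d, y in zip(docs, labels) if term not in d]
    conditional = 0.0
    for part in (with_t, without_t):
        if part:
            s = sum(part)
            conditional += len(part) / n * entropy(s, len(part) - s)
    return entropy(spam, n - spam) - conditional


def nb_spam_probability(doc, docs, labels, vocab):
    """P(spam | doc) from Laplace-smoothed likelihoods of the terms present.

    Assumes the training set contains at least one spam and one ham message.
    """
    n_spam = sum(labels)
    n_ham = len(labels) - n_spam
    log_odds = math.log(n_spam / n_ham)
    for term in vocab:
        if term in doc:
            in_spam = sum(1 for d, y in zip(docs, labels) if y and term in d)
            in_ham = sum(1 for d, y in zip(docs, labels) if not y and term in d)
            p_spam = (in_spam + 1) / (n_spam + 2)
            p_ham = (in_ham + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
    return 1.0 / (1.0 + math.exp(-log_odds))


def linear_score(doc, weights, bias=0.0):
    """Signed distance-like score from a linear decision function."""
    return sum(weights.get(term, 0.0) for term in doc) + bias


def two_tier_classify(doc, weights, margin, docs, labels, vocab, threshold=0.5):
    """Return (is_spam, tier): tier 1 decides outside the margin, tier 2 inside."""
    score = linear_score(doc, weights)
    if abs(score) > margin:
        return score > 0, 1
    return nb_spam_probability(doc, docs, labels, vocab) >= threshold, 2


# Toy corpus (hypothetical): each message is a set of terms; label 1 = spam.
docs = [{"free", "win", "money"}, {"free", "offer"},
        {"meeting", "report"}, {"lunch", "meeting"}]
labels = [1, 1, 0, 0]

all_terms = sorted(set().union(*docs))
# Tier-1 features: the two terms with the highest Information Gain.
keywords = sorted(all_terms, key=lambda t: -information_gain(docs, labels, t))[:2]

# Hand-set weights standing in for a trained SVM hyperplane (assumption).
weights = {"free": 1.0, "meeting": -1.0}

clear_spam = two_tier_classify({"free", "win"}, weights, 0.5, docs, labels, all_terms)
ambiguous = two_tier_classify({"free", "meeting", "money"}, weights, 0.5,
                              docs, labels, all_terms)
clear_ham = two_tier_classify({"meeting", "lunch"}, weights, 0.5, docs, labels, all_terms)
print(keywords, clear_spam, ambiguous, clear_ham)
```

In this sketch a message whose score lies far from the hyperplane is decided in the first tier, while a message inside the margin is deferred to the Bayesian tier, whose `threshold` can be tuned independently of the SVM.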
author2 Sheng-Cheng Yeh
author_facet Sheng-Cheng Yeh
Hsien-Chun Chang
張僩鈞
author Hsien-Chun Chang
張僩鈞
spellingShingle Hsien-Chun Chang
張僩鈞
A Research for The Two-Tier Filtering Scheme of Anti-Spam
author_sort Hsien-Chun Chang
title A Research for The Two-Tier Filtering Scheme of Anti-Spam
title_short A Research for The Two-Tier Filtering Scheme of Anti-Spam
title_full A Research for The Two-Tier Filtering Scheme of Anti-Spam
title_fullStr A Research for The Two-Tier Filtering Scheme of Anti-Spam
title_full_unstemmed A Research for The Two-Tier Filtering Scheme of Anti-Spam
title_sort research for the two-tier filtering scheme of anti-spam
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/2624zh
work_keys_str_mv AT hsienchunchang aresearchforthetwotierfilteringschemeofantispam
AT zhāngxiànjūn aresearchforthetwotierfilteringschemeofantispam
AT hsienchunchang liǎngjiēcéngshìlājīyóujiànguòlǜjīzhìzhīyánjiū
AT zhāngxiànjūn liǎngjiēcéngshìlājīyóujiànguòlǜjīzhìzhīyánjiū
AT hsienchunchang researchforthetwotierfilteringschemeofantispam
AT zhāngxiànjūn researchforthetwotierfilteringschemeofantispam
_version_ 1718624936457666560