Summary: | Master's === I-Shou University (義守大學) === Master's Program, Department of Information Management === 97 === With the popularity of the Internet and the widespread use of e-mail, the volume of spam keeps growing rapidly. This growth annoys users and significantly reduces work efficiency. Most previous research has focused on spam filtering algorithms, using statistical or data mining approaches to derive precise spam rules. However, mail servers may generate new spam rules constantly and therefore carry an ever-growing rule set. Because spam evolves continuously, some rules become outdated or imprecise, and applying them may cause misclassification. In addition, too many rules on a mail server may degrade the performance of mail filters. In this research, we propose an anti-spam approach that combines data mining with statistical testing. We use data mining to generate spam rules and statistical tests to evaluate their effectiveness. Based on this evaluation, only significant rules are used to classify e-mail, and the remaining rules can be eliminated to improve performance.
|
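
The rule-pruning idea in the summary can be illustrated with a minimal sketch. The abstract does not name the statistical test used, so the example below assumes a chi-square test of independence on a 2x2 table (rule fired or not, versus spam or legitimate mail) at the 0.05 significance level. The names `RuleStats`, `chi_square_2x2`, and `significant_rules`, as well as the counts in the usage lines, are hypothetical and serve only as illustration; they are not taken from the thesis.

```python
from dataclasses import dataclass

# Critical value of the chi-square distribution with 1 degree of freedom
# at significance level 0.05 (assumed threshold, not from the thesis).
CHI2_CRITICAL_DF1_ALPHA_05 = 3.841


@dataclass
class RuleStats:
    """Hypothetical match counts for one spam rule on a labelled mail sample."""
    name: str
    spam_hits: int    # spam messages the rule matched
    spam_misses: int  # spam messages the rule did not match
    ham_hits: int     # legitimate messages the rule matched
    ham_misses: int   # legitimate messages the rule did not match


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero row or column total means the test is undefined;
        # treat the rule as carrying no evidence.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def significant_rules(rules, critical=CHI2_CRITICAL_DF1_ALPHA_05):
    """Return names of rules whose spam association is significant."""
    kept = []
    for r in rules:
        stat = chi_square_2x2(r.spam_hits, r.spam_misses, r.ham_hits, r.ham_misses)
        spam_rate = r.spam_hits / max(1, r.spam_hits + r.spam_misses)
        ham_rate = r.ham_hits / max(1, r.ham_hits + r.ham_misses)
        # Keep a rule only if it fires more often on spam than on legitimate
        # mail and the association exceeds the critical value.
        if stat > critical and spam_rate > ham_rate:
            kept.append(r.name)
    return kept


# Illustrative counts only: rule "A" separates spam from legitimate mail well,
# rule "B" fires almost equally often on both classes.
rules = [
    RuleStats("A", spam_hits=80, spam_misses=20, ham_hits=5, ham_misses=95),
    RuleStats("B", spam_hits=50, spam_misses=50, ham_hits=48, ham_misses=52),
]
kept = significant_rules(rules)
```

With these made-up counts, rule "A" passes the test and rule "B" would be dropped from the rule set, which is the kind of reduction the summary describes for improving filter performance.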