Summary: | Master's === Tatung University === Department of Computer Science and Engineering === 96 === As the Internet has become widespread and network speeds have increased, reading e-mail is now the first thing many people do after turning on the computer. Just as a household mailbox fills with junk mail, our e-mail inboxes are often littered with spam. Excessive spam has become users' biggest concern when receiving e-mail.
The existence of spam will not stop people from using e-mail, but the flood of spam is a constant annoyance for users. It not only lengthens the time needed to receive and read mail but also costs mental effort to filter and delete it. Large volumes of junk e-mail also take up mailbox space; if the mailbox is not cleaned up promptly, even legitimate mail can no longer be received.
The main purpose of this research is to develop an e-mail classification system based on a Back-Propagation Network, using automatic text categorization techniques. We first extract the important features from the mail file. We then apply a Chinese word segmentation algorithm to the mail subject and content. Next, we use keyword selection and weighting algorithms to find each mail's keywords and calculate similarity. Finally, we combine the Back-Propagation Network with the similarity value to classify e-mail and automatically filter spam.
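The keyword weighting and similarity step can be illustrated with a minimal sketch. The abstract does not name the weighting scheme or the similarity measure, so TF-IDF weights and cosine similarity are assumptions here, as are the function names `tfidf_vectors` and `cosine`. The input is assumed to be already segmented into tokens; the Chinese segmentation step itself is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn token lists into sparse TF-IDF vectors (dict: term -> weight).

    Assumption: TF-IDF with a smoothed IDF stands in for the weighting
    algorithm, which the abstract does not specify.
    """
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per mail
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty mails
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical, already-segmented mails used only for illustration.
mails = [
    ["free", "offer", "win"],
    ["meeting", "schedule", "report"],
    ["free", "win", "prize"],
]
vecs = tfidf_vectors(mails)
sim_spam_like = cosine(vecs[0], vecs[2])  # the two mails share "free" and "win"
sim_unrelated = cosine(vecs[0], vecs[1])  # the two mails share no terms
```

In a pipeline like the one described, a similarity value of this kind would be one of the inputs passed on to the Back-Propagation Network; how the two are combined is not detailed in the abstract.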
The experimental results show that the system can perform the classification function, and it also achieves good recall and precision in spam filtering. We hope to lighten users' burden in receiving mail and to reduce the consumption of network resources; indeed, the system reduces e-mail processing time and also decreases the amount of spam.
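For reference, recall and precision for the spam class can be computed as in the sketch below. The function name `precision_recall`, the label strings, and the toy labels are hypothetical and are not taken from the experiments.

```python
def precision_recall(y_true, y_pred, positive="spam"):
    """Return (precision, recall) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # spam caught
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # normal mail flagged
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # spam missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels for illustration only; not experimental data.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]
precision, recall = precision_recall(y_true, y_pred)
```

Precision measures how much of the mail flagged as spam really is spam, while recall measures how much of the actual spam was caught.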
|