Summary: | Master's === National Dong Hwa University === Master's Degree Program in Digital Knowledge Management === 98 === With the rapid development of the Internet, distance is almost no barrier to
information transmission. Delivering messages over the Internet is convenient,
but users often receive large numbers of unsolicited mails, known as spam or
junk mail. To address the spam problem, it is important to develop an
efficient anti-spam filtering mechanism.
In this research, we propose an effective spam filtering mechanism based on
analyzing the structural features of e-mail. The filtering system is divided into
a training phase, a classification phase, and a re-learning phase. In the training phase, we
apply a decision tree data mining algorithm to find association rules among the
attributes of the e-mail header, attachments, and image structure. These rules are
then used to identify spam in the classification phase. In the re-learning phase, we
update the rules according to the misjudged mails. Moreover, we add new
keywords extracted from spam mails to the spam keyword database to strengthen the system,
thereby reducing its misjudgment rate.
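The three phases described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a simple ID3-style decision tree over categorical features, and the feature names (`has_attachment`, `has_image`, `reply_to_differs`), the toy mails, and the helper functions are all hypothetical. The spam keyword database is left out for brevity.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def build_tree(rows, labels, features):
    """Training phase: grow an ID3-style tree over categorical features."""
    if len(set(labels)) == 1:          # pure node becomes a leaf
        return labels[0]
    majority = Counter(labels).most_common(1)[0][0]
    if not features:                   # nothing left to split on
        return majority

    def split_entropy(feature):
        # Weighted entropy of the labels after splitting on `feature`.
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    rest = [f for f in features if f != best]
    node = {"feature": best, "default": majority, "children": {}}
    for value in {row[best] for row in rows}:
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in subset], [y for _, y in subset], rest)
    return node

def classify(tree, row):
    """Classification phase: follow the learned rules down to a leaf."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree

def relearn(rows, labels, features, misjudged):
    """Re-learning phase: add corrected mails and rebuild the rules."""
    rows = rows + [r for r, _ in misjudged]
    labels = labels + [y for _, y in misjudged]
    return build_tree(rows, labels, features), rows, labels

# Hypothetical structural features and toy training mails (illustration only).
FEATURES = ["has_attachment", "has_image", "reply_to_differs"]
train_rows = [
    {"has_attachment": False, "has_image": True,  "reply_to_differs": True},
    {"has_attachment": False, "has_image": True,  "reply_to_differs": False},
    {"has_attachment": True,  "has_image": False, "reply_to_differs": False},
    {"has_attachment": False, "has_image": False, "reply_to_differs": False},
]
train_labels = ["spam", "spam", "ham", "ham"]

tree = build_tree(train_rows, train_labels, FEATURES)

# A spam mail that the first set of rules misjudges as legitimate ("ham").
missed = {"has_attachment": True, "has_image": False, "reply_to_differs": True}
first_guess = classify(tree, missed)

# Feed the corrected label back in and rebuild the rules.
new_tree, train_rows, train_labels = relearn(
    train_rows, train_labels, FEATURES, [(missed, "spam")])
second_guess = classify(new_tree, missed)
```

In this toy run the first tree splits only on `has_image`, so the image-free spam mail slips through; after re-learning, the rebuilt tree also uses `reply_to_differs` and classifies that mail as spam. Rebuilding the whole tree is the simplest way to maintain the rules; an incremental update would be an alternative design.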
In our experiments, the method achieved an accuracy of up to 99%, a
precision of up to 98%, and a recall of up to 100%. These results indicate that
our method is an efficient mechanism for filtering spam.
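For reference, the three reported measures can be computed from predicted and actual labels as in the sketch below. It assumes that spam is treated as the positive class, which the abstract does not state explicitly, and the labels in the example are made up for illustration rather than taken from the experiment.

```python
def evaluate(actual, predicted, positive="spam"):
    """Return accuracy, precision, and recall for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # spam caught
    fp = sum(a != positive and p == positive for a, p in pairs)  # ham flagged
    fn = sum(a == positive and p != positive for a, p in pairs)  # spam missed
    tn = sum(a != positive and p != positive for a, p in pairs)  # ham passed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Made-up labels for illustration only.
actual    = ["spam", "spam", "ham", "ham", "ham"]
predicted = ["spam", "spam", "spam", "ham", "ham"]
scores = evaluate(actual, predicted)
```

Under this assumption, a recall of 100% together with a lower precision means that no spam was missed while some legitimate mail was flagged as spam.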
|