Islamic web pages filtering and categorization

The Internet creates a world without boundaries in which people can obtain a great deal of information simply by browsing. However, some of this information is neither genuine nor correct. As a result, practitioners of deviant teachings can take this opportunity to attract followers just u...


Bibliographic Details
Main Author: Mohd. Zamry, Nurfazrina (Author)
Format: Thesis
Published: 2013-06.
Subjects: TK5015.888 Web sites
Online Access: Get fulltext
LEADER 02146 am a22001573u 4500
001 35863
042 |a dc 
100 1 0 |a Mohd. Zamry, Nurfazrina  |e author 
245 0 0 |a Islamic web pages filtering and categorization 
260 |c 2013-06. 
520 |a The Internet creates a world without boundaries in which people can obtain a great deal of information simply by browsing. However, some of this information is neither genuine nor correct. As a result, practitioners of deviant teachings can use the Internet to attract followers, in particular to distort the beliefs of Muslims in Malaysia. Web filtering protects against inappropriate content and prevents misuse of the network; it can therefore be used to filter the content of suspicious websites and limit their dissemination. Currently, websites promoting deviant teachings are blocked manually. In addition, few web filtering products filter religious content, and very few support the Malay language. This project aims to classify websites into three categories: deviant, suspicious and clean. The web filtering process comprises pre-processing, feature selection and classification. Pre-processing involves three steps, HTML parsing, stemming and stop-word removal, which produce the deviant-teaching keywords. Three existing term weighting schemes, TF, TFIDF and Modified Entropy (M.Entropy), are used for feature selection, and a Support Vector Machine (SVM) is used for classification. Classification is evaluated by accuracy, precision, recall and F1. A total of 300 web pages were collected from the Internet across the three categories: deviant teaching, suspicious and clean. The results show that M.Entropy is the most suitable term weighting scheme for filtering Islamic web pages, compared with TFIDF and Entropy. 
546 |a en 
650 0 4 |a TK5015.888 Web sites 
655 7 |a Thesis 
787 0 |n http://eprints.utm.my/id/eprint/35863/ 
856 |z Get fulltext  |u http://eprints.utm.my/id/eprint/35863/5/NurFazrinaMohdZamryMFSKSM2013.pdf
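
The abstract above describes a pipeline of HTML parsing, stop-word removal and term weighting. The following is a minimal sketch of those steps for the TF and TFIDF schemes only, not the thesis's implementation. The stop-word list, the sample pages and all function names are hypothetical; stemming is omitted because the record does not say which stemmer was used, and Modified Entropy and the SVM step are left out because their details are not given here.

```python
import math
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list for illustration; the thesis's list is not given.
STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in"}


class TextExtractor(HTMLParser):
    """Collects text content, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def tokenize(html):
    """HTML parsing plus stop-word removal (no stemming in this sketch)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Replace every non-alphanumeric character with a space, then split.
    words = "".join(c if c.isalnum() else " " for c in text).split()
    return [w for w in words if w not in STOP_WORDS]


def term_frequency(tokens):
    """TF: count of each term divided by the document length."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: c / total for t, c in counts.items()} if total else {}


def tf_idf(docs_tokens):
    """TFIDF: TF multiplied by log(N / document frequency)."""
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    weights = []
    for tokens in docs_tokens:
        tf = term_frequency(tokens)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights


# Hypothetical sample pages, not taken from the thesis data set.
pages = [
    "<html><body><h1>Prayer guide</h1><p>The guide to daily prayer.</p></body></html>",
    "<html><script>var x = 1;</script><body><p>Daily news and sport.</p></body></html>",
]
tokens = [tokenize(p) for p in pages]
weights = tf_idf(tokens)
```

In this sketch a term that occurs in every page receives a TFIDF weight of zero, which is why such terms contribute nothing to separating the categories. The resulting weight dictionaries could be turned into feature vectors for a classifier such as an SVM.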