URL String Information Extraction for Malicious URL Filter

Master's (碩士) === National Taiwan University of Science and Technology (國立臺灣科技大學) === Department of Computer Science and Information Engineering (資訊工程系) === 100 === With the widespread adoption of web services, web-based attacks such as phishing and drive-by downloads have become regular threats. In practice, about one million URLs, of which only about one hundred are malicious, are submitted to the server for analysis...

Bibliographic Details
Main Authors: Min-Sheng Lin, 林閔笙
Other Authors: Yuh-Jye Lee
Format: Others
Language: en_US
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/zzvhv2
id ndltd-TW-100NTUS5392050
record_format oai_dc
spelling ndltd-TW-100NTUS5392050; 2019-05-15T20:51:11Z; http://ndltd.ncl.edu.tw/handle/zzvhv2; URL String Information Extraction for Malicious URL Filter; 利用網址字串資訊萃取之惡意網址過濾器; Min-Sheng Lin 林閔笙; Master's (碩士); National Taiwan University of Science and Technology (國立臺灣科技大學); Department of Computer Science and Information Engineering (資訊工程系); 100; Yuh-Jye Lee 李育杰; 2012; 學位論文 (thesis); 54; en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's (碩士) === National Taiwan University of Science and Technology (國立臺灣科技大學) === Department of Computer Science and Information Engineering (資訊工程系) === 100 === With the widespread adoption of web services, web-based attacks such as phishing and drive-by downloads have become regular threats. In practice, about one million URLs, of which only about one hundred are malicious, are submitted to the server for analysis within one hour. Analyzing such an overwhelming number of URLs with content-based or host-based information is impractical. To avoid this overhead, we propose using only the string information of the URLs, namely the lexical information and the static characteristics of the URL strings, to filter malicious URLs. The lexical information and the static characteristics capture different aspects of a URL string. Using these two kinds of information, we build two corresponding filters with different online learning algorithms. In our framework, the predictions of the two filters are fused at test time. In our experiments, the proposed filtering system handles one million URLs in 5 minutes and filters out 75% of them as benign. The remaining 25%, flagged as suspicious, cover around 90% of the malicious URLs. These promising results show that the proposed method is efficient and suitable for large-scale URL analysis.
author2 Yuh-Jye Lee
author_facet Yuh-Jye Lee
Min-Sheng Lin
林閔笙
author Min-Sheng Lin
林閔笙
title URL String Information Extraction for Malicious URL Filter
publishDate 2012
url http://ndltd.ncl.edu.tw/handle/zzvhv2
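
The two-filter design described in the abstract can be illustrated with a minimal sketch. The record does not say which features, which online learning algorithms, or which fusion rule the thesis uses, so everything concrete here is an assumption made for illustration: the token-splitting rule, the handful of static statistics, the perceptron learner, the OR-style fusion, and all names (`lexical_features`, `static_features`, `OnlinePerceptron`, `FusedFilter`).

```python
import re


def lexical_features(url):
    """Bag of tokens: split the URL string on non-alphanumeric characters.

    Assumption: the thesis's actual lexical representation is not given.
    """
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]
    return {f"tok={t}": 1.0 for t in tokens}


def static_features(url):
    """A few string statistics, chosen here only as plausible examples."""
    n = max(len(url), 1)
    return {
        "bias": 1.0,
        "length": len(url) / 100.0,
        "digit_ratio": sum(c.isdigit() for c in url) / n,
        "dots": url.count(".") / 10.0,
        "has_at": 1.0 if "@" in url else 0.0,
    }


class OnlinePerceptron:
    """Mistake-driven online learner (a stand-in for the unnamed algorithms)."""

    def __init__(self):
        self.w = {}

    def score(self, feats):
        return sum(self.w.get(k, 0.0) * v for k, v in feats.items())

    def update(self, feats, label):
        # label is +1 for malicious, -1 for benign; update only on a mistake.
        if label * self.score(feats) <= 0:
            for k, v in feats.items():
                self.w[k] = self.w.get(k, 0.0) + label * v


class FusedFilter:
    """One filter per feature view; predictions are fused at test time."""

    def __init__(self):
        self.lexical = OnlinePerceptron()
        self.static = OnlinePerceptron()

    def update(self, url, label):
        self.lexical.update(lexical_features(url), label)
        self.static.update(static_features(url), label)

    def is_suspicious(self, url):
        # Assumed fusion rule: keep the URL for deeper analysis if either
        # filter flags it, which favors recall on the rare malicious class.
        return (self.lexical.score(lexical_features(url)) > 0
                or self.static.score(static_features(url)) > 0)


if __name__ == "__main__":
    # Toy, made-up URLs; real training data would be a labeled URL stream.
    stream = [
        ("http://paypal.secure-login.example.ru/verify.php?id=12345", +1),
        ("http://www.example.com/news/index.html", -1),
    ]
    url_filter = FusedFilter()
    for _ in range(5):
        for url, label in stream:
            url_filter.update(url, label)
    for url, _ in stream:
        print(url, "suspicious" if url_filter.is_suspicious(url) else "benign")
```

Because both learners only read the URL string and update one example at a time, the cost per URL is a few dictionary lookups, which is the property that makes string-only filtering attractive as a cheap first stage before content-based or host-based analysis.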