URL String Information Extraction for Malicious URL Filter
Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 100 === With the widespread adoption of web services, web-based attacks such as phishing and drive-by downloads have become regular threats. In practice, about one million URLs, of which only about one hundred are malicious, are submitted to the server for analysis...
Main Authors: | Min-Sheng Lin, 林閔笙 |
---|---|
Other Authors: | Yuh-Jye Lee |
Format: | Others |
Language: | en_US |
Published: | 2012 |
Online Access: | http://ndltd.ncl.edu.tw/handle/zzvhv2 |
id |
ndltd-TW-100NTUS5392050 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-100NTUS5392050 2019-05-15T20:51:11Z http://ndltd.ncl.edu.tw/handle/zzvhv2 URL String Information Extraction for Malicious URL Filter 利用網址字串資訊萃取之惡意網址過濾器 Min-Sheng Lin 林閔笙 Master's National Taiwan University of Science and Technology Department of Computer Science and Information Engineering 100 With the widespread adoption of web services, web-based attacks such as phishing and drive-by downloads have become regular threats. In practice, about one million URLs, of which only about one hundred are malicious, are submitted to the server for analysis within a single hour. Analyzing such an overwhelming number of URLs using content-based or host-based information is impractical. To avoid this overhead, we propose to filter malicious URLs using only the string information of the URLs, namely the lexical information and the static characteristics of the URL strings. Lexical information and static characteristics capture different aspects of a URL string, so we build one filter for each kind of information, each trained with a different online learning algorithm. In our framework, the predictions of the two filters are fused at test time. In our experiments, the proposed filtering system processes one million URLs in 5 minutes and filters out 75% of them as benign. The remaining 25%, flagged as suspicious, cover around 90% of the malicious URLs. These promising results indicate that the proposed method is efficient and suitable for large-scale URL analysis. Yuh-Jye Lee 李育杰 2012 thesis 54 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 100 === With the widespread adoption of web services, web-based attacks such as phishing and drive-by downloads have become regular threats. In practice, about one million URLs, of which only about one hundred are malicious, are submitted to the server for analysis within a single hour. Analyzing such an overwhelming number of URLs using content-based or host-based information is impractical. To avoid this overhead, we propose to filter malicious URLs using only the string information of the URLs, namely the lexical information and the static characteristics of the URL strings. Lexical information and static characteristics capture different aspects of a URL string, so we build one filter for each kind of information, each trained with a different online learning algorithm. In our framework, the predictions of the two filters are fused at test time. In our experiments, the proposed filtering system processes one million URLs in 5 minutes and filters out 75% of them as benign. The remaining 25%, flagged as suspicious, cover around 90% of the malicious URLs. These promising results indicate that the proposed method is efficient and suitable for large-scale URL analysis. |
author2 |
Yuh-Jye Lee |
author_facet |
Yuh-Jye Lee Min-Sheng Lin 林閔笙 |
author |
Min-Sheng Lin 林閔笙 |
spellingShingle |
Min-Sheng Lin 林閔笙 URL String Information Extraction for Malicious URL Filter |
author_sort |
Min-Sheng Lin |
title |
URL String Information Extraction for Malicious URL Filter |
publishDate |
2012 |
url |
http://ndltd.ncl.edu.tw/handle/zzvhv2 |
work_keys_str_mv |
AT minshenglin urlstringinformationextractionformaliciousurlfilter AT línmǐnshēng urlstringinformationextractionformaliciousurlfilter AT minshenglin lìyòngwǎngzhǐzìchuànzīxùncuìqǔzhīèyìwǎngzhǐguòlǜqì AT línmǐnshēng lìyòngwǎngzhǐzìchuànzīxùncuìqǔzhīèyìwǎngzhǐguòlǜqì |
_version_ |
1719104645939331072 |
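
The abstract in this record describes two filters built from URL string information alone, each trained online, with their predictions fused. The sketch below illustrates that general shape only and is not the thesis's implementation: the tokenization, the particular static features, the two learners (a perceptron and SGD logistic regression), the OR fusion rule, and every function and class name are assumptions chosen for illustration, since the record does not specify them.

```python
import math
import re
from urllib.parse import urlsplit


def lexical_features(url):
    """Bag of tokens from the URL string (assumed tokenization on common delimiters)."""
    tokens = re.split(r"[/.?=&:_\-]+", url.lower())
    return {"tok=" + t: 1.0 for t in tokens if t}


def static_features(url):
    """A few string statistics (illustrative picks, not the thesis's feature list)."""
    host = urlsplit(url).hostname or ""
    return {
        "bias": 1.0,
        "log_length": math.log1p(len(url)),
        "num_dots": float(host.count(".")),
        "num_digits": float(sum(c.isdigit() for c in url)),
        "has_ip_host": 1.0 if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) else 0.0,
    }


class OnlinePerceptron:
    """Mistake-driven online learner over sparse features; labels are +1 or -1."""

    def __init__(self):
        self.weights = {}

    def score(self, features):
        return sum(self.weights.get(k, 0.0) * v for k, v in features.items())

    def update(self, features, label):
        if label * self.score(features) <= 0:  # update only on a mistake
            for k, v in features.items():
                self.weights[k] = self.weights.get(k, 0.0) + label * v


class OnlineLogistic(OnlinePerceptron):
    """Online logistic regression: one SGD step per example."""

    def __init__(self, learning_rate=0.1):
        super().__init__()
        self.learning_rate = learning_rate

    def update(self, features, label):
        # Clamp the margin so math.exp cannot overflow.
        margin = max(-30.0, min(30.0, label * self.score(features)))
        gradient = -label / (1.0 + math.exp(margin))  # d(log loss) / d(score)
        for k, v in features.items():
            step = self.learning_rate * gradient * v
            self.weights[k] = self.weights.get(k, 0.0) - step


def train(stream, lexical_filter, static_filter):
    """Single pass over (url, label) pairs, updating both filters per example."""
    for url, label in stream:
        lexical_filter.update(lexical_features(url), label)
        static_filter.update(static_features(url), label)


def is_suspicious(url, lexical_filter, static_filter):
    """Assumed fusion rule: flag the URL if either filter scores it as malicious."""
    return (lexical_filter.score(lexical_features(url)) > 0
            or static_filter.score(static_features(url)) > 0)


if __name__ == "__main__":
    # Hypothetical toy stream; +1 marks malicious, -1 marks benign.
    stream = [
        ("http://example.com/docs/index.html", -1),
        ("http://192.0.2.7/secure-login/update.php?id=1", +1),
        ("http://example.org/about", -1),
    ]
    lexical_filter, static_filter = OnlinePerceptron(), OnlineLogistic()
    train(stream, lexical_filter, static_filter)
    print(is_suspicious("http://192.0.2.9/login.php", lexical_filter, static_filter))
```

An OR fusion favors recall, which matches the role of a pre-filter: URLs flagged here would still go on to the costlier content-based or host-based analysis. Using the abstract's own figures, keeping 25% of one million URLs leaves 250,000 for that deeper analysis, and those contain roughly 90 of the approximately 100 malicious ones.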