A Memory Efficient DFA using Compression and Pattern Segmentation

碩士 === 國立成功大學 === 資訊工程學系 === 102 === As the traffic of Internet is grow quickly, there are more and more malicious attacks and viruses spread over the Internet. The role of network security systems such as network intrusion detection system (NIDS), firewalls, and antivirus software have become more...

Full description

Bibliographic Details
Main Authors: Yuen-ShuoLi, 李袁碩
Other Authors: Yeim-Kuan Chang
Format: Others
Language:en_US
Published: 2014
Online Access:http://ndltd.ncl.edu.tw/handle/33470851965807507696
id ndltd-TW-102NCKU5392104
record_format oai_dc
spelling ndltd-TW-102NCKU53921042015-10-14T00:12:48Z http://ndltd.ncl.edu.tw/handle/33470851965807507696 A Memory Efficient DFA using Compression and Pattern Segmentation 藉由壓縮技術和字串切割實現節省記憶體的 DFA Yuen-ShuoLi 李袁碩 碩士 國立成功大學 資訊工程學系 102 As the traffic of Internet is grow quickly, there are more and more malicious attacks and viruses spread over the Internet. The role of network security systems such as network intrusion detection system (NIDS), firewalls, and antivirus software have become more important. These systems usually use regexes to inspect the payload of packet, which is one of the most intensive tasks in the systems. We have to develop a high-throughput algorithm that requires a small amount of memory to find out the hidden virus in packet payload. Regular expressions is more complicated than simple patterns. When multiple regular expressions are processed together, the corresponding DFA may be so complicated and need a large amount of memory. The numbers of states and transitions grow so fast when a new regular expression is added. As the virus and malicious packets grow quickly, more patterns is needed to be compared. Therefore, we hope to find a scheme to decrease the memory consumption of DFA and keep the search performance acceptable. In this thesis, we propose a memory efficient parallel compatible DFA that uses the techniques of compression and pattern segmentation. With the idea of PFAC algorithm, we propose a new DFA called compressed segmented DFA (CS-DFA) that needs less number of states and transitions than δFA. Without considering the symbols of “.*” in the beginning of the regular expression, we notice that a DFA state transitions for most symbols to the same next state. As a result, the transition table can be compressed by run-length encoding. The number of transitions in the proposed CS-DFA is about a half of the transitions need in δFA, and has 74% of the memory consumed by δFA. In addition, we focus on two conditions “Counting Constraints” and “Kleene Star” which cause state blowup frequently in pattern sets. Then the technique of pattern segmentation is applied. The memory consumption is smaller than pure CS-DFA without pattern segmentation by about 19%. The search performance is still acceptable when using OpenMP. Yeim-Kuan Chang 張燕光 2014 學位論文 ; thesis 59 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description 碩士 === 國立成功大學 === 資訊工程學系 === 102 === As the traffic of Internet is grow quickly, there are more and more malicious attacks and viruses spread over the Internet. The role of network security systems such as network intrusion detection system (NIDS), firewalls, and antivirus software have become more important. These systems usually use regexes to inspect the payload of packet, which is one of the most intensive tasks in the systems. We have to develop a high-throughput algorithm that requires a small amount of memory to find out the hidden virus in packet payload. Regular expressions is more complicated than simple patterns. When multiple regular expressions are processed together, the corresponding DFA may be so complicated and need a large amount of memory. The numbers of states and transitions grow so fast when a new regular expression is added. As the virus and malicious packets grow quickly, more patterns is needed to be compared. Therefore, we hope to find a scheme to decrease the memory consumption of DFA and keep the search performance acceptable. In this thesis, we propose a memory efficient parallel compatible DFA that uses the techniques of compression and pattern segmentation. With the idea of PFAC algorithm, we propose a new DFA called compressed segmented DFA (CS-DFA) that needs less number of states and transitions than δFA. Without considering the symbols of “.*” in the beginning of the regular expression, we notice that a DFA state transitions for most symbols to the same next state. As a result, the transition table can be compressed by run-length encoding. The number of transitions in the proposed CS-DFA is about a half of the transitions need in δFA, and has 74% of the memory consumed by δFA. In addition, we focus on two conditions “Counting Constraints” and “Kleene Star” which cause state blowup frequently in pattern sets. Then the technique of pattern segmentation is applied. The memory consumption is smaller than pure CS-DFA without pattern segmentation by about 19%. The search performance is still acceptable when using OpenMP.
author2 Yeim-Kuan Chang
author_facet Yeim-Kuan Chang
Yuen-ShuoLi
李袁碩
author Yuen-ShuoLi
李袁碩
spellingShingle Yuen-ShuoLi
李袁碩
A Memory Efficient DFA using Compression and Pattern Segmentation
author_sort Yuen-ShuoLi
title A Memory Efficient DFA using Compression and Pattern Segmentation
title_short A Memory Efficient DFA using Compression and Pattern Segmentation
title_full A Memory Efficient DFA using Compression and Pattern Segmentation
title_fullStr A Memory Efficient DFA using Compression and Pattern Segmentation
title_full_unstemmed A Memory Efficient DFA using Compression and Pattern Segmentation
title_sort memory efficient dfa using compression and pattern segmentation
publishDate 2014
url http://ndltd.ncl.edu.tw/handle/33470851965807507696
work_keys_str_mv AT yuenshuoli amemoryefficientdfausingcompressionandpatternsegmentation
AT lǐyuánshuò amemoryefficientdfausingcompressionandpatternsegmentation
AT yuenshuoli jíyóuyāsuōjìshùhézìchuànqiègēshíxiànjiéshěngjìyìtǐdedfa
AT lǐyuánshuò jíyóuyāsuōjìshùhézìchuànqiègēshíxiànjiéshěngjìyìtǐdedfa
AT yuenshuoli memoryefficientdfausingcompressionandpatternsegmentation
AT lǐyuánshuò memoryefficientdfausingcompressionandpatternsegmentation
_version_ 1718087708456255488