Using Information Retrieval Approach for Malware Classification

Master's === National Cheng Kung University === Institute of Computer and Communication Engineering === 102 === In this paper, we propose an Information Retrieval (IR) approach to classify malware samples into known malware families. First, each training sample is submitted to a dynamic analysis tool, Cuckoo Sandbox, to obtain the information of API (Application Prog...

Bibliographic Details
Main Authors:	Tzung-Shian Tsai, 蔡宗憲
Other Authors:	Chu-Sing Yang
Format:	Others
Language:	zh-TW
Published:	2014
Online Access:	http://ndltd.ncl.edu.tw/handle/22111470442079737846

id	ndltd-TW-102NCKU5652039
record_format	oai_dc
spelling	ndltd-TW-102NCKU5652039 2016-03-07T04:10:57Z http://ndltd.ncl.edu.tw/handle/22111470442079737846 Using Information Retrieval Approach for Malware Classification 利用資訊檢索方式於惡意程式分類之研究 Tzung-Shian Tsai 蔡宗憲 Master's National Cheng Kung University Institute of Computer and Communication Engineering 102 In this paper, we propose an Information Retrieval (IR) approach to classify malware samples into known malware families. First, each training sample is submitted to a dynamic analysis tool, Cuckoo Sandbox, to record the API (Application Programming Interface) calls that the sample makes. Every call consists of three parts: function name, parameter name, and parameter value. In the retrieval phase, the same procedure is applied to the test sample. The TF-IDF (Term Frequency-Inverse Document Frequency) algorithm is then used to model the test sample and all training samples as vectors based on the API call information. Each vector describes the behavioral characteristics of a malware sample and is used to compare behavioral similarity. Finally, the test sample is classified by retrieving the most similar malware family. Chu-Sing Yang 楊竹星 2014 thesis 39 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Cheng Kung University === Institute of Computer and Communication Engineering === 102 === In this paper, we propose an Information Retrieval (IR) approach to classify malware samples into known malware families. First, each training sample is submitted to a dynamic analysis tool, Cuckoo Sandbox, to record the API (Application Programming Interface) calls that the sample makes. Every call consists of three parts: function name, parameter name, and parameter value. In the retrieval phase, the same procedure is applied to the test sample. The TF-IDF (Term Frequency-Inverse Document Frequency) algorithm is then used to model the test sample and all training samples as vectors based on the API call information. Each vector describes the behavioral characteristics of a malware sample and is used to compare behavioral similarity. Finally, the test sample is classified by retrieving the most similar malware family.
author2	Chu-Sing Yang
author_facet	Chu-Sing Yang Tzung-Shian Tsai 蔡宗憲
author	Tzung-Shian Tsai 蔡宗憲
spellingShingle	Tzung-Shian Tsai 蔡宗憲 Using Information Retrieval Approach for Malware Classification
author_sort	Tzung-Shian Tsai
title	Using Information Retrieval Approach for Malware Classification
title_short	Using Information Retrieval Approach for Malware Classification
title_full	Using Information Retrieval Approach for Malware Classification
title_fullStr	Using Information Retrieval Approach for Malware Classification
title_full_unstemmed	Using Information Retrieval Approach for Malware Classification
title_sort	using information retrieval approach for malware classification
publishDate	2014
url	http://ndltd.ncl.edu.tw/handle/22111470442079737846
work_keys_str_mv	AT tzungshiantsai usinginformationretrievalapproachformalwareclassification AT càizōngxiàn usinginformationretrievalapproachformalwareclassification AT tzungshiantsai lìyòngzīxùnjiǎnsuǒfāngshìyúèyìchéngshìfēnlèizhīyánjiū AT càizōngxiàn lìyòngzīxùnjiǎnsuǒfāngshìyúèyìchéngshìfēnlèizhīyánjiū
_version_	1718199712248496128
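
The retrieval step described in the abstract (TF-IDF vectors built from API call information, then a similarity comparison) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The record does not state the token format, the IDF variant, the similarity measure, or the decision rule, so the sketch assumes one token per call in the form `function|parameter name|parameter value`, a smoothed IDF, cosine similarity, and assigning the family of the single most similar training sample. The names `tfidf_vectors`, `cosine`, `classify`, and `TRAINING`, along with the family labels and sample calls, are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Convert token lists into sparse TF-IDF vectors (dicts keyed by token)."""
    n = len(docs)
    # Document frequency: number of samples in which each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF; this variant is an assumption, not taken from the thesis.
    idf = {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({tok: (c / len(doc)) * idf[tok] for tok, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(test_tokens, training):
    """Return the family of the training sample most similar to the test sample.

    `training` is a list of (family, tokens) pairs. The test sample is
    vectorized together with the training samples, so IDF covers all of them.
    """
    docs = [tokens for _, tokens in training] + [test_tokens]
    vectors = tfidf_vectors(docs)
    test_vec = vectors[-1]
    scored = [
        (cosine(test_vec, vec), family)
        for vec, (family, _) in zip(vectors, training)
    ]
    return max(scored)[1]


# Hypothetical training data: each token is "function|parameter name|value".
TRAINING = [
    ("family_a", ["CreateFileW|FileName|a.tmp", "WriteFile|Length|512"]),
    ("family_a", ["WriteFile|Length|512", "RegSetValueExW|ValueName|Run"]),
    ("family_b", ["connect|port|80", "send|length|128"]),
]

if __name__ == "__main__":
    print(classify(["connect|port|80", "send|length|64"], TRAINING))
```

With the toy data above, the test sample shares a token only with the `family_b` sample, so the script prints `family_b`. A real pipeline would build the token lists from the sandbox's behavior report rather than from hand-written strings.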

Similar Items