A Study on Anomaly Detection Ensembles

Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 104 === An anomaly, or outlier, is something that is different from the rest. These differences may ultimately correspond to an object or event of interest, the detection of which often proves to be of great importance or interest. For example, fraud, spam, and device ma...


Bibliographic Details
Main Authors: Alvin Chin-Yen Chiang, 蔣勤彥
Other Authors: Yuh-Jye Lee
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/15861838708969908230
id ndltd-TW-104NTUS5392007
record_format oai_dc
spelling ndltd-TW-104NTUS5392007 2017-10-29T04:34:40Z http://ndltd.ncl.edu.tw/handle/15861838708969908230 A Study on Anomaly Detection Ensembles 融合式異常偵測之探究 Alvin Chin-Yen Chiang 蔣勤彥 Master's National Taiwan University of Science and Technology Department of Computer Science and Information Engineering 104 Yuh-Jye Lee 李育杰 2015 thesis 60 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 104 === An anomaly, or outlier, is something that is different from the rest. These differences may ultimately correspond to an object or event of interest, the detection of which often proves to be of great importance. For example, fraud, spam, and device malfunctions correspond to events that need to be noticed, and to notice them we characterize them by their deviation from normality. By automating the creation of a ranked list of what is most deviant, we can save time and decrease the cognitive overload of the individuals or groups responsible for responding to such events. Over the years, many anomaly and outlier metrics and detection methods have been developed for the purpose of finding data incongruities. In this thesis, we review the general strategies and measures used to characterize the 'strangeness' of data, as well as how these separate methods may be combined. Under the assumption that "the crowd is wise", we adopt an eclectic approach and propose a clustering-based score ensembling method for outlier detection. Using benchmark datasets, we quantitatively evaluate the robustness and accuracy of different ensemble strategies. We find that ensembling strategies offer only limited value for increasing overall performance, but provide robustness and protection from underperforming models. We also discuss the use of randomization to create ensemble-based methods. Based on our results, we conclude that, given the current state of the art, unsupervised anomaly detection faces significant challenges.
author2 Yuh-Jye Lee
author_facet Yuh-Jye Lee
Alvin Chin-Yen Chiang
蔣勤彥
author Alvin Chin-Yen Chiang
蔣勤彥
author_sort Alvin Chin-Yen Chiang
title A Study on Anomaly Detection Ensembles
title_sort study on anomaly detection ensembles
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/15861838708969908230