A Study on Anomaly Detection Ensembles
Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 104 === An anomaly, or outlier, is something that differs from the rest. Such differences may correspond to an object or event of interest, and detecting it is often of great importance. For example, fraud, spam, and device ma...
Main Authors: | Alvin Chin-Yen Chiang, 蔣勤彥 |
---|---|
Other Authors: | Yuh-Jye Lee |
Format: | Others |
Language: | en_US |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/15861838708969908230 |
id |
ndltd-TW-104NTUS5392007 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-104NTUS5392007 2017-10-29T04:34:40Z http://ndltd.ncl.edu.tw/handle/15861838708969908230 A Study on Anomaly Detection Ensembles 融合式異常偵測之探究 Alvin Chin-Yen Chiang 蔣勤彥 Master's thesis National Taiwan University of Science and Technology Department of Computer Science and Information Engineering 104 Yuh-Jye Lee 李育杰 2015 Degree thesis (學位論文) ; thesis 60 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 104 === An anomaly, or outlier, is something that differs from the rest. Such differences may correspond to an object or event of interest, and detecting it is often of great importance. For example, fraud, spam, and device malfunctions are events that need to be noticed, and to notice them we characterize them by their deviation from normality. By automating the creation of a ranked list of the most deviant items, we can save time and reduce the cognitive load on the individuals or groups responsible for responding to such events.
Over the years, many anomaly and outlier metrics and detection methods have been developed to find incongruities in data. In this thesis, we review the general strategies and measures used to characterize the "strangeness" of data, as well as how these separate methods may be combined. Under the assumption that "the crowd is wise", we adopt an eclectic approach and propose a clustering-based score ensembling method for outlier detection. Using benchmark datasets, we quantitatively evaluate the robustness and accuracy of different ensemble strategies. We find that ensembling strategies offer only limited value for increasing overall performance, but provide robustness and protection against underperforming models. We also discuss the use of randomization to create ensemble-based methods. Based on our results, we conclude that, given the current state of the art, unsupervised anomaly detection faces significant challenges.
|
author2 |
Yuh-Jye Lee |
author_facet |
Yuh-Jye Lee Alvin Chin-Yen Chiang 蔣勤彥 |
author |
Alvin Chin-Yen Chiang 蔣勤彥 |
title |
A Study on Anomaly Detection Ensembles |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/15861838708969908230 |