Foundations of Perturbation Robust Clustering

Bibliographic Details
Other Authors: Moore, Jarrod (author)
Format: Others
Language: English
Published: Florida State University
Online Access: http://purl.flvc.org/fsu/fd/FSU_SUMMER2017_Moore_fsu_0071N_13913
Description
Summary: Clustering is a fundamental data mining tool that aims to divide data into groups of similar items. Intuition about clustering reflects the ideal case: exact data sets endowed with flawless dissimilarity between individual instances. In practice, however, these cases are in the minority, and clustering applications are typically characterized by noisy data sets with approximate pairwise dissimilarities. As such, clustering methods must be robust to perturbations in order to be effective. In this paper, we address foundational questions on perturbation robustness, studying to what extent clustering techniques can exhibit this desirable characteristic. Our results also demonstrate the types of cluster structure required for robustness of popular clustering paradigms. === A Thesis submitted to the Department of Computer Science in partial fulfillment of the requirements for the degree of Master of Science. === Summer Semester 2017. === May 4, 2017. === Includes bibliographical references. === Margareta Ackerman, Professor Co-Directing Thesis; Gary Tyson, Professor Co-Directing Thesis; Sonia Haiduc, Committee Member; Peixiang Zhao, Committee Member.