Summary: | 博士 === 國立臺灣大學 === 資訊工程學研究所 === 106 === The study of density estimation has produced algorithms that has been used across many disciplines and has become a common fixture in the analysis of data. However density estimation has not been able to perform well on high-dimensional datasets. In this study, we discuss the reasons that traditional density estimation would not work well for high dimensional data. Why they give values that are uninterpretable, with either the values so low that the values may be greatly affected by the model noise or computational noise, or the values are so high where we cannot compute the ratio of infinity over infinity. This study proposes using negative log distance to k nearest neighbors as the metric to compare when the dimension of the samples are not known. The resulting classifier, HDDE, was used to classify images in domains with close to 100k dimensions with reasonable results.
|