Application of Clustering in the Non-Parametric Estimation of Distribution Density

Bibliographic Details
Main Authors: T. Ruzgas, R. Rudzkis, M. Kavaliauskas
Format: Article
Language: English
Published: Vilnius University Press 2006-11-01
Series: Nonlinear Analysis
Online Access: http://www.journals.vu.lt/nonlinear-analysis/article/view/14741
Description
Summary: This paper addresses the problem of estimating a multimodal density function of a random vector. The accuracy of several popular non-parametric estimators is compared by means of the Monte Carlo method. The paper demonstrates that estimation quality increases significantly if the sample is first clustered (i.e., the multimodal density function is approximated by a mixture of unimodal densities) and the density estimation methods are then applied separately to each cluster. The sample is clustered using the Gaussian mixture model and the EM algorithm. In the cases analysed, the highest efficiency was achieved by applying, after this initial clustering, the iterative procedure proposed by Friedman to estimate the density component corresponding to each cluster. The Friedman procedure is based on projection pursuit of multivariate observations and on transforming the univariate projections into standard Gaussian random values (using density function estimates of these projections).
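The cluster-then-estimate idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it is restricted to one dimension, it assumes hard assignment of each observation to its most probable mixture component, and it substitutes a plain Gaussian kernel estimator with a rule-of-thumb bandwidth for the Friedman projection pursuit procedure. All function names (`em_gaussian_mixture`, `clustered_density`, and so on) are hypothetical.

```python
import math
import random
import statistics


def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def em_gaussian_mixture(data, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture by EM; return (weights, means, stds)."""
    ordered = sorted(data)
    n = len(ordered)
    # Assumption: initialise the means at evenly spaced sample quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    stds = [statistics.pstdev(data)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and standard deviations.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            stds[j] = max(math.sqrt(var), 1e-6)
    return weights, means, stds


def silverman_bandwidth(points):
    """Rule-of-thumb bandwidth for a Gaussian kernel (an assumed choice)."""
    return 1.06 * statistics.stdev(points) * len(points) ** (-0.2)


def kde(points, x, bandwidth):
    """Gaussian kernel density estimate at x."""
    return sum(normal_pdf(x, p, bandwidth) for p in points) / len(points)


def clustered_density(data, k):
    """Cluster with a Gaussian mixture, then estimate each cluster's density separately."""
    weights, means, stds = em_gaussian_mixture(data, k)
    clusters = [[] for _ in range(k)]
    for x in data:
        post = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
        clusters[post.index(max(post))].append(x)  # hard assignment
    clusters = [c for c in clusters if len(c) >= 2]
    n = sum(len(c) for c in clusters)
    bands = [silverman_bandwidth(c) for c in clusters]

    def density(x):
        # Mixture of per-cluster estimates, weighted by cluster size.
        return sum(len(c) / n * kde(c, x, h) for c, h in zip(clusters, bands))

    return density


# Synthetic bimodal sample with very different spreads in the two modes.
rng = random.Random(42)
sample = [rng.gauss(-3.0, 0.5) for _ in range(300)]
sample += [rng.gauss(4.0, 1.5) for _ in range(300)]
density = clustered_density(sample, k=2)
```

Because each cluster receives its own bandwidth, the narrow mode is not oversmoothed by the spread of the wide one, which is the intuition behind estimating each component separately. A multivariate version would need covariance matrices in the EM step and a multivariate estimator per cluster.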
ISSN: 1392-5113, 2335-8963