Big Data…a few Outliers = Big Mistakes. Un nuovo processo per l’individuazione di outliers [A new process for identifying outliers]

Bibliographic Details
Main Author: Maurizio Rosina
Format: Article
Language: English
Published: mediaGEO soc. coop. 2018-05-01
Series: GEOmedia
Online Access: http://mediageo.it/ojs/index.php/GEOmedia/article/view/1520
Description
Summary: Searching for and identifying outliers is a fundamental step, generally carried out before any processing aimed at obtaining consistent results. The new approach devised for identifying outliers in R² relies on geometric and statistical techniques that are largely independent of the type of data distribution. It rests on four methodological pillars: clustering, the convex hull peeling technique, a specific metric, and Chebyshev’s inequality, which holds for any univariate distribution of values with finite variance. The modularity and generality of the approach, together with an outlier search based on strictly statistical parameters, make it a practical everyday tool for anyone who needs to process bivariate data with confidence that outliers have been identified beforehand.
ISSN: 1128-8132, 2283-5687
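
Illustrative sketch: The summary names convex hull peeling, a metric, and Chebyshev’s inequality as building blocks. The Python sketch below shows one way these pieces could fit together for a single cluster of 2D points. It is not the article’s algorithm: the clustering step is omitted, the Euclidean distance to the centroid of the innermost peeled layer stands in for the article’s unspecified metric, and the function names and the `alpha` parameter are hypothetical.

```python
import math
import statistics


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain. Returns hull vertices only (collinear edge points are skipped)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def peel_layers(points):
    """Convex hull peeling: strip the hull vertices repeatedly until no points remain."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers


def chebyshev_outliers(points, alpha=0.05):
    """Flag points whose distance to the peeled core exceeds mean + k * std.

    Chebyshev's inequality gives P(|D - mu| >= k * sigma) <= 1 / k**2 for any
    distribution with finite variance, so k = 1 / sqrt(alpha) bounds the tail
    mass by alpha. Assumption: Euclidean distance to the centroid of the
    innermost layer is used as the metric.
    """
    core = peel_layers(points)[-1]
    cx = statistics.fmean([p[0] for p in core])
    cy = statistics.fmean([p[1] for p in core])
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    mu = statistics.fmean(dists)
    sigma = statistics.pstdev(dists)
    k = 1.0 / math.sqrt(alpha)
    # One-sided test: only points far from the core are of interest.
    return [p for p, d in zip(points, dists) if d - mu > k * sigma]


if __name__ == "__main__":
    grid = [(float(i), float(j)) for i in range(5) for j in range(5)]
    print(chebyshev_outliers(grid + [(100.0, 100.0)]))
```

Because the Chebyshev bound makes no distributional assumption, it is conservative: with `alpha=0.05` the threshold sits about 4.47 standard deviations above the mean distance, so small samples may yield no flagged points at all.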