A geometric view on Pearson’s correlation coefficient and a generalization of it to non-linear dependencies

Measuring the strength or degree of statistical dependence between two random variables is a common problem in many domains. Pearson’s correlation coefficient ρ is an accurate measure of linear dependence. We show that ρ is a normalized, Euclidean-type distance between the joint probability distribution of...


Bibliographic Details
Main Author:	Priyantha Wijayatunga
Format:	Article
Language:	English
Published:	Accademia Piceno Aprutina dei Velati 2016-06-01
Series:	Ratio Mathematica
Subjects:	metric/distance; probability simplex; normalization
Online Access:	http://eiris.it/ojs/index.php/ratiomathematica/article/view/5

id	doaj-415f5ce80b10468c9e6da56ccb327fb0
record_format	Article
spelling	doaj-415f5ce80b10468c9e6da56ccb327fb0 | 2020-11-24T22:49:54Z | eng | Accademia Piceno Aprutina dei Velati | Ratio Mathematica | 1592-7415 | 2282-8214 | 2016-06-01 | 30132110.23755/rm.v30i1.513 | A geometric view on Pearson’s correlation coefficient and a generalization of it to non-linear dependencies | Priyantha Wijayatunga | 0 | Department of Statistics, Umeå School of Business and Economics, Umeå University, Umeå 901 87, Sweden | Measuring the strength or degree of statistical dependence between two random variables is a common problem in many domains. Pearson’s correlation coefficient ρ is an accurate measure of linear dependence. We show that ρ is a normalized, Euclidean-type distance between the joint probability distribution of the two random variables and the joint distribution obtained when their independence is assumed while keeping their marginal distributions. The normalizing constant is the geometric mean of two maximal distances, each between the joint probability distribution when full linear dependence is assumed while preserving the respective marginal distribution and the joint distribution when independence is assumed. Its use is restricted to linear dependence because it is based on Euclidean-type distances, which are generally not metrics, and because the full dependence considered is linear. Therefore, we argue that if a suitable distance metric is used while considering all possible maximal dependences, then it can measure any non-linear dependence. But then, one must define all the full dependences. The Hellinger distance, which is a metric, can be used as the distance measure between probability distributions to obtain a generalization of ρ for the discrete case. | http://eiris.it/ojs/index.php/ratiomathematica/article/view/5 | metric/distance | probability simplex | normalization
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Priyantha Wijayatunga
publisher	Accademia Piceno Aprutina dei Velati
series	Ratio Mathematica
issn	1592-7415 2282-8214
publishDate	2016-06-01
description	Measuring the strength or degree of statistical dependence between two random variables is a common problem in many domains. Pearson’s correlation coefficient ρ is an accurate measure of linear dependence. We show that ρ is a normalized, Euclidean-type distance between the joint probability distribution of the two random variables and the joint distribution obtained when their independence is assumed while keeping their marginal distributions. The normalizing constant is the geometric mean of two maximal distances, each between the joint probability distribution when full linear dependence is assumed while preserving the respective marginal distribution and the joint distribution when independence is assumed. Its use is restricted to linear dependence because it is based on Euclidean-type distances, which are generally not metrics, and because the full dependence considered is linear. Therefore, we argue that if a suitable distance metric is used while considering all possible maximal dependences, then it can measure any non-linear dependence. But then, one must define all the full dependences. The Hellinger distance, which is a metric, can be used as the distance measure between probability distributions to obtain a generalization of ρ for the discrete case.
topic	metric/distance; probability simplex; normalization
url	http://eiris.it/ojs/index.php/ratiomathematica/article/view/5


Similar Items