Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces

The rapid development of modern technology has created many complex datasets in non-linear spaces, while most of the statistical hypothesis tests are only available in Euclidean or Hilbert spaces. To properly analyze the data with more complicated structures, efforts have been made to solve the fund...


Bibliographic Details
Main Authors: Jin Zhu, Wenliang Pan, Wei Zheng, Xueqin Wang
Format: Article
Language: English
Published: Foundation for Open Access Statistics 2021-03-01
Series: Journal of Statistical Software
Subjects: k-sample test; test of mutual independence; ball divergence; ball covariance; metric space
Online Access: https://www.jstatsoft.org/index.php/jss/article/view/3599
id doaj-e69cdcd8cd9641e1814bffb54c810e5d
record_format Article
spelling doaj-e69cdcd8cd9641e1814bffb54c810e5d | 2021-05-04T00:11:49Z | eng | Foundation for Open Access Statistics | Journal of Statistical Software | 1548-7660 | 2021-03-01 | 97 | 1 | 1 | 31 | 10.18637/jss.v097.i06 | 1413 | Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces | Jin Zhu | Wenliang Pan | Wei Zheng | Xueqin Wang | The rapid development of modern technology has created many complex datasets in non-linear spaces, while most of the statistical hypothesis tests are only available in Euclidean or Hilbert spaces. To properly analyze the data with more complicated structures, efforts have been made to solve the fundamental test problems in more general spaces (Lyons 2013; Pan, Tian, Wang, and Zhang 2018; Pan, Wang, Zhang, Zhu, and Zhu 2020). In this paper, we introduce a publicly available R package Ball for the comparison of multiple distributions and the test of mutual independence in metric spaces, which extends the test procedures for the equality of two distributions (Pan et al. 2018) and the independence of two random objects (Pan et al. 2020). The Ball package is computationally efficient since several novel algorithms as well as engineering techniques are employed in speeding up the ball test procedures. Two real data analyses and diverse numerical studies have been performed, and the results certify that the Ball package can detect various distribution differences and complicated dependencies in complex datasets, e.g., directional data and symmetric positive definite matrix data. | https://www.jstatsoft.org/index.php/jss/article/view/3599 | k-sample test | test of mutual independence | ball divergence | ball covariance | metric space
collection DOAJ
language English
format Article
sources DOAJ
author Jin Zhu
Wenliang Pan
Wei Zheng
Xueqin Wang
title Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces
publisher Foundation for Open Access Statistics
series Journal of Statistical Software
issn 1548-7660
publishDate 2021-03-01
description The rapid development of modern technology has created many complex datasets in non-linear spaces, while most of the statistical hypothesis tests are only available in Euclidean or Hilbert spaces. To properly analyze the data with more complicated structures, efforts have been made to solve the fundamental test problems in more general spaces (Lyons 2013; Pan, Tian, Wang, and Zhang 2018; Pan, Wang, Zhang, Zhu, and Zhu 2020). In this paper, we introduce a publicly available R package Ball for the comparison of multiple distributions and the test of mutual independence in metric spaces, which extends the test procedures for the equality of two distributions (Pan et al. 2018) and the independence of two random objects (Pan et al. 2020). The Ball package is computationally efficient since several novel algorithms as well as engineering techniques are employed in speeding up the ball test procedures. Two real data analyses and diverse numerical studies have been performed, and the results certify that the Ball package can detect various distribution differences and complicated dependencies in complex datasets, e.g., directional data and symmetric positive definite matrix data.
topic k-sample test
test of mutual independence
ball divergence
ball covariance
metric space
url https://www.jstatsoft.org/index.php/jss/article/view/3599