Tau-Path Test - A Nonparametric Test For Testing Unspecified Subpopulation Monotone Association

Bibliographic Details
Main Author: Yu, Li
Language: English
Published: The Ohio State University / OhioLINK 2009
Subjects:
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=osu1255657068
id ndltd-OhioLink-oai-etd.ohiolink.edu-osu1255657068
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-osu1255657068 2021-08-03T05:57:21Z Tau-Path Test - A Nonparametric Test For Testing Unspecified Subpopulation Monotone Association Yu, Li Statistics Concordance matrix copula drug assay microarray mixture permutation quassinoids bivariate Mallow's model multistage
In data mining and other settings, there is sometimes a need to identify relationships between variables when the relationship may hold only over a subset of the available observations. For example, expression of a particular gene may cause resistance to an anticancer drug, but only in certain types of cancer cell lines. It may not be known in advance which types of cancer cell lines (e.g., estrogen-regulated, newly differentiated, central nervous system) employ such a method of resistance. This situation differs from the usual setting in which partial correlations are estimated conditional on a known selection, such as the value of another variable. For any pair of variables of interest, the goal is to test whether they are associated in some unspecified subpopulation that is represented by a subsample of the available data. Nothing in the literature deals directly with this problem. We have tried several parametric and nonparametric approaches and, for both inferential and computational reasons, have chosen to present a procedure based on a sequential development of Kendall's tau measure of monotone association. The sequence is obtained by reordering the observations so that the sample tau coefficients for the first k of the n observations form a monotone decreasing path, ending at Kendall's tau coefficient for the full sample. Boundaries are constructed so that 95% of the paths remain within the boundaries under the null hypothesis of independence. A boundary crossing at any point k is evidence of a stronger-than-expected association within a subpopulation represented by the k observations involved.
The method is used to screen for association between gene expression and compound activity among types of cancer cell lines in the NCI-60 database. We prove that a particular method of reordering the observations is optimal against any other ordering for simultaneously identifying the highest Kendall's tau association in subsets of size k (k = 2,...,n). Furthermore, assuming a subpopulation of size k, we present a way of quantifying how likely any observation is to be in that subpopulation. From a statistical modeling point of view, we first show that the semi-parametric bivariate Mallows model provides a good tool for modeling the paired empirical ranking, through Kendall's distance, of bivariate samples from parametric copula models. We then show that the Mallows model can also model bivariate samples from special three- or two-component copula mixture models (Fréchet-Hoeffding upper bound copula, positively associated copula, and independence copula). Finally, a generalized, permutation-governed n-stage bivariate Mallows model is proposed to model n independent bivariate samples from n copulas of the same family but with n different association parameters. It is shown that, in the two-copula mixture case (positively associated copula and independence copula), neither the power of the tau-path test to detect subpopulation association nor the ability of the method to identify the associated subpopulation depends on the copula model chosen for the positively associated component. In the n-copula mixture case, the tau-path method can be applied to put the n observations in an order close to the order of the strength of association of their parent copula models when a reasonable proportion of these copulas have moderate or strong association.
2009 English text The Ohio State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=osu1255657068 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
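The reordering idea in the abstract (sample tau over the first k observations forming a monotone decreasing path that ends at the full-sample Kendall's tau) can be illustrated with a small Python sketch. This is only an assumption-laden illustration, not the procedure from the dissertation: it uses a simple greedy backward elimination on the concordance matrix (repeatedly moving the observation least concordant with the rest to the end of the ordering), it computes tau-a with no tie correction, and it does not construct the 95% null boundaries. The names `concordance_matrix` and `tau_path` are hypothetical.

```python
from itertools import combinations


def concordance_matrix(x, y):
    """Return c with c[i][j] = +1 (concordant), -1 (discordant), 0 (tie)."""
    n = len(x)
    c = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        c[i][j] = c[j][i] = (s > 0) - (s < 0)
    return c


def tau_path(x, y):
    """Greedy sketch (an assumption, not the dissertation's algorithm).

    Returns (order, taus), where taus[k - 2] is Kendall's tau-a computed
    on the first k observations of `order`, for k = 2, ..., n.
    Requires at least two observations.
    """
    c = concordance_matrix(x, y)
    n = len(x)
    remaining = list(range(n))
    # Row sums measure how concordant each observation is with the others.
    row = [sum(c[i]) for i in range(n)]
    removed = []
    while len(remaining) > 2:
        # Drop the observation least concordant with those still remaining.
        worst = min(remaining, key=lambda i: row[i])
        remaining.remove(worst)
        removed.append(worst)
        for i in remaining:
            row[i] -= c[i][worst]
    # Observations dropped first go last in the final ordering.
    order = remaining + removed[::-1]

    # Accumulate concordance sums to get tau on each leading subset.
    taus, total = [], 0
    for k in range(1, n):
        total += sum(c[order[k]][order[j]] for j in range(k))
        taus.append(total / (k * (k + 1) / 2))
    return order, taus


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    y = [1, 2, 3, 4, 8, 5, 7, 6]
    order, taus = tau_path(x, y)
    print(order)
    print([round(t, 3) for t in taus])
```

Removing the observation with the smallest row sum can never lower the tau of the remaining set, because the smallest row sum is at most the average row sum. Read in the forward direction, the resulting path is therefore non-increasing in k. In a full test, the path would be compared against boundaries calibrated under independence, for example by permuting one variable. That step is omitted here.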
collection NDLTD
language English
sources NDLTD
topic Statistics
Concordance matrix
copula
drug assay
microarray
mixture
permutation
quassinoids
bivariate
Mallow's model
multistage
author Yu, Li
title Tau-Path Test - A Nonparametric Test For Testing Unspecified Subpopulation Monotone Association
publisher The Ohio State University / OhioLINK
publishDate 2009
url http://rave.ohiolink.edu/etdc/view?acc_num=osu1255657068