Contributions to the Theory of Measures of Association for Ordinal Variables
Main Author: | |
---|---|
Format: | Doctoral Thesis |
Language: | English |
Published: | Uppsala universitet, Statistik, 2009 |
Subjects: | |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-100735 http://nbn-resolving.de/urn:isbn:978-91-554-7498-0 |
Summary: | In this thesis, we consider measures of association for ordinal variables from a theoretical perspective. In particular, we study the phi-coefficient, the tetrachoric correlation coefficient and the polychoric correlation coefficient. We also introduce a new measure of association for ordinal variables, the empirical polychoric correlation coefficient, which has better theoretical properties than the polychoric correlation coefficient, including greatly enhanced robustness. In the first article, entitled "On the relation between the phi-coefficient and the tetrachoric correlation coefficient", we show that under given marginal probabilities there exists a continuous bijection between the two measures of association. Furthermore, we show that the bijection has a fixed point at zero for all marginal probabilities. Consequently, the choice of which of these measures of association to use is for all practical purposes a matter of preference only. In the second article, entitled "A generalized definition of the tetrachoric correlation coefficient", we generalize the tetrachoric correlation coefficient so that a large class of parametric families of bivariate distributions can be assumed as underlying distributions. We also provide a necessary and sufficient condition for the generalized tetrachoric correlation coefficient to be well defined for a given parametric family of bivariate distributions. With examples, we illustrate the effects on the polychoric correlation coefficient of different distributional assumptions. In the third article, entitled "A generalized definition of the polychoric correlation coefficient", we generalize the polychoric correlation coefficient to a large class of parametric families of bivariate distributions, and show that the generalized and the conventional polychoric correlation coefficients agree on the family of bivariate normal distributions. With examples, we illustrate the effects of different distributional assumptions on the polychoric correlation coefficient. In combination with goodness-of-fit p-values, the association analysis can be enriched with a consideration of possible tail dependence. In the fourth article, we propose a new measure of association for ordinal variables, named the empirical polychoric correlation coefficient. The empirical polychoric correlation coefficient relaxes the fundamental assumption of the polychoric correlation coefficient so that an underlying joint distribution is only assumed to exist, not to be of a particular parametric family. We also provide an asymptotic result, by which the empirical polychoric correlation coefficient converges almost surely to the true polychoric correlation under very general conditions. Thus, the proposed empirical polychoric correlation coefficient has better theoretical properties than the polychoric correlation coefficient. |
---|
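The relation between the phi-coefficient and the tetrachoric correlation coefficient described in the summary can be illustrated numerically. The following is a minimal sketch and is not taken from the thesis: it assumes the conventional tetrachoric setting, in which a 2x2 table arises from dichotomizing a standard bivariate normal distribution. The function names (`phi_coefficient`, `tetrachoric`, `_upper_orthant`) and the numerical choices (Simpson's rule, bisection on the interval [-0.999, 0.999]) are illustrative assumptions.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for thresholds and margins


def phi_coefficient(p11, p1, q1):
    """Phi for a 2x2 table: p11 = P(X=1, Y=1), p1 = P(X=1), q1 = P(Y=1)."""
    return (p11 - p1 * q1) / math.sqrt(p1 * (1 - p1) * q1 * (1 - q1))


def _upper_orthant(h, k, rho, steps=400):
    """P(Z1 > h, Z2 > k) for a standard bivariate normal with correlation rho.

    Uses the identity P = (1 - Phi(h)) * (1 - Phi(k)) + integral over r from
    0 to rho of the bivariate normal density at (h, k), via Simpson's rule.
    """
    def density(r):
        s = 1.0 - r * r
        expo = -(h * h - 2.0 * r * h * k + k * k) / (2.0 * s)
        return math.exp(expo) / (2.0 * math.pi * math.sqrt(s))

    step = rho / steps  # signed step, so negative rho is handled as well
    total = density(0.0) + density(rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * step)
    integral = total * step / 3.0
    return (1 - _STD.cdf(h)) * (1 - _STD.cdf(k)) + integral


def tetrachoric(p11, p1, q1, tol=1e-10):
    """Solve for rho such that the upper orthant probability equals p11.

    Assumption of this sketch: the search is restricted to [-0.999, 0.999].
    """
    h = _STD.inv_cdf(1 - p1)  # threshold with P(Z1 > h) = p1
    k = _STD.inv_cdf(1 - q1)  # threshold with P(Z2 > k) = q1
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The orthant probability is increasing in rho, so bisection applies.
        if _upper_orthant(h, k, mid) < p11:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Independence (p11 = p1 * q1): both measures are zero, the fixed point.
print(round(phi_coefficient(0.12, 0.3, 0.4), 6), round(tetrachoric(0.12, 0.3, 0.4), 6))
# Median splits with p11 = 1/3: phi is 1/3 while the tetrachoric value is 0.5.
print(round(phi_coefficient(1 / 3, 0.5, 0.5), 4), round(tetrachoric(1 / 3, 0.5, 0.5), 4))
```

For fixed margins, the orthant probability is strictly increasing in the correlation parameter, which is why each admissible value of the phi-coefficient corresponds to exactly one tetrachoric value in this sketch.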