STATISTICAL MODELS UTILIZING DEPENDENCE BETWEEN VARIABLES

We propose a model particularly suitable for modeling the relationship between a dependent variable and a vector of independent variables. Binary response variables and categorical data are of specific interest in this research. I-projection problems arise in a wide variety of problems and different...

Bibliographic Details
Main Author: Al-Talib, Mohammad M.
Format: Others
Published: OpenSIUC 2014
Online Access: https://opensiuc.lib.siu.edu/dissertations/792
https://opensiuc.lib.siu.edu/cgi/viewcontent.cgi?article=1795&context=dissertations
id ndltd-siu.edu-oai-opensiuc.lib.siu.edu-dissertations-1795
record_format oai_dc
spelling ndltd-siu.edu-oai-opensiuc.lib.siu.edu-dissertations-1795 2018-12-20T04:31:44Z STATISTICAL MODELS UTILIZING DEPENDENCE BETWEEN VARIABLES Al-Talib, Mohammad M. 2014-05-01T07:00:00Z text application/pdf https://opensiuc.lib.siu.edu/dissertations/792 https://opensiuc.lib.siu.edu/cgi/viewcontent.cgi?article=1795&context=dissertations Dissertations OpenSIUC
collection NDLTD
format Others
sources NDLTD
description We propose a model particularly suitable for modeling the relationship between a dependent variable and a vector of independent variables. Binary response variables and categorical data are of specific interest in this research. I-projection problems arise in a wide variety of settings. The I-projection plays a key role in the information-theoretic approach to statistics (Kullback, 1959; Good, 1963), e.g. in the maximization of entropy (Jaynes, 1957; Rao, 1965) and in the theory of large deviations (Sanov, 1957). Maximum likelihood estimation in log-linear models for multinomial distributions is equivalent to solving an I-projection problem (Robertson et al., 1988). I-projection and Fenchel's duality are used in our research to introduce the proposed model. We consider estimating the parameters of a joint distribution of d random variables by projecting the distribution under independence onto a convex set C, defined as the intersection of convex cones of probability distributions describing dependence. The goal is to find the probability distribution in C that is closest to the independent probability distribution (vector). The primal problem and its dual are introduced using Fenchel's duality theorem; furthermore, the solution of the primal problem is a function of the solution of the dual problem. An algorithm to find the solution is established for the case of two predictor variables and generalized to the case of any d predictor variables; this algorithm is modified from the works of Csiszar (1975), Dykstra (1985), and Bhattacharya and Dykstra (1997). Luo and Tsai (2012) proposed a semi-parametric proportional likelihood ratio model that is particularly suitable for modeling a nonlinear monotonic relationship between the outcome variable and a covariate. This model extends the generalized linear model by leaving the distribution unspecified. A theorem that characterizes the solution of the maximization of the log-likelihood function is stated, and the maximization is also solved by the profile likelihood approach. An algorithm was established based on the profile likelihood method, but they were not able to find conditions that guarantee its convergence; divergence was not encountered in the real data analysis and simulation study. The logistic regression model is one of the most popular statistical models for the analysis of binary data, with applications in the physical, biomedical, and behavioral sciences, among others. Normally, the asymptotic properties of the maximum likelihood estimates of the model parameters are used for statistical inference. A relationship between the proposed model and the logistic regression model is established, and we investigate estimating the proposed model's parameters from the perspective of logistic regression. The well-known Newton-Raphson method is studied extensively, and an algorithm for the solution is presented. A simulation study is carried out to compare the performance of the projection method with that of the Newton-Raphson method. The robustness of the two methods is also studied by comparing them when the observed samples are contaminated and when the model is misspecified.
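The abstract refers to Newton-Raphson estimation for logistic regression. As a rough illustration only, and not the dissertation's own algorithm, the following minimal sketch fits a logistic model with an intercept and a single covariate by Newton-Raphson. The function names `sigmoid` and `fit_logistic_newton`, the starting values, and the stopping rule are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_newton(xs, ys, max_iter=25, tol=1e-10):
    """Newton-Raphson for the model P(y = 1 | x) = sigmoid(b0 + b1 * x).

    Hypothetical helper for illustration: one covariate, start at (0, 0),
    stop when the largest Newton step component falls below `tol`.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0            # score vector (gradient of log-likelihood)
        h00 = h01 = h11 = 0.0    # information matrix X'WX (negative Hessian)
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            r = y - p
            w = p * (1.0 - p)
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            raise ValueError("information matrix is numerically singular")
        # Newton step: solve (X'WX) d = score using the 2x2 inverse.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1


if __name__ == "__main__":
    # Made-up, non-separable toy data, so the maximum likelihood estimate is finite.
    xs = [0, 1, 2, 3, 4, 5, 6, 7]
    ys = [0, 0, 1, 0, 1, 0, 1, 1]
    print(fit_logistic_newton(xs, ys))
```

At convergence the score equations hold, that is, the residuals y - p sum to approximately zero both unweighted and weighted by x. On perfectly separated data the estimates diverge, so a practical implementation would need safeguards that this sketch omits.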
author Al-Talib, Mohammad M.
title STATISTICAL MODELS UTILIZING DEPENDENCE BETWEEN VARIABLES
publisher OpenSIUC
publishDate 2014