A diagnostic function to examine candidate distributions to model univariate data

Bibliographic Details
Main Author: Richards, John
Language: en_US
Published: Kansas State University 2010
Subjects:
R
Online Access: http://hdl.handle.net/2097/4093
Description
Summary: Master of Science. Department of Statistics. Suzanne Dubnicka. To help identify distributions that effectively model univariate continuous data, the R function diagnostic is proposed. The function aids in determining reasonable candidate distributions from which the data may have come. It uses a combination of the Pearson goodness-of-fit statistic, the Anderson-Darling statistic, Lin’s concordance correlation between the theoretical and observed quantiles, and the maximum difference between the theoretical and observed quantiles. The function generates reasonable candidate distributions, QQ plots, and histograms with superimposed density curves. In a simulation study, the function worked adequately; however, many of the distributions look very similar if their parameters are chosen carefully. The function was then used to try to determine which distribution could model the weekly grocery expenditures of a family household.
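
Two of the quantile-based measures named in the summary, Lin's concordance correlation and the maximum difference between theoretical and observed quantiles, can be illustrated with a short sketch. This is not the report's diagnostic function: it is written in Python rather than R, it considers only a normal candidate fitted by the sample mean and standard deviation, and the function name quantile_agreement and the plotting positions (i - 0.5)/n are assumptions made for illustration.

```python
from statistics import NormalDist, fmean


def quantile_agreement(data):
    """Compare sorted data with the quantiles of a fitted normal candidate.

    Hypothetical helper for illustration only. Returns a tuple
    (concordance, max_diff).
    """
    n = len(data)
    observed = sorted(data)

    # Fit the candidate by moments (assumption: divisor n, not n - 1).
    mu = fmean(observed)
    sigma = (sum((x - mu) ** 2 for x in observed) / n) ** 0.5
    candidate = NormalDist(mu, sigma)

    # Theoretical quantiles at plotting positions (i - 0.5) / n (assumption).
    theoretical = [candidate.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

    mean_o, mean_t = fmean(observed), fmean(theoretical)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_t = sum((t - mean_t) ** 2 for t in theoretical) / n
    cov = sum(
        (o - mean_o) * (t - mean_t) for o, t in zip(observed, theoretical)
    ) / n

    # Lin's concordance correlation: 2*s_xy / (s_x^2 + s_y^2 + (xbar - ybar)^2).
    concordance = 2 * cov / (var_o + var_t + (mean_o - mean_t) ** 2)

    # Largest absolute gap between paired observed and theoretical quantiles.
    max_diff = max(abs(o - t) for o, t in zip(observed, theoretical))
    return concordance, max_diff


if __name__ == "__main__":
    # Made-up sample values, used only to show the call.
    sample = [52.1, 47.3, 60.8, 55.0, 49.9, 71.4, 58.2, 45.6, 63.7, 54.4]
    print(quantile_agreement(sample))
```

A concordance close to 1 together with a small maximum difference suggests the points of the QQ plot lie near the 45-degree line for that candidate. Ranking several candidate distributions would repeat the same comparison with each candidate's quantile function.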