On two-sample data analysis by exponential model

by Choi, Sujung

Abstract (Summary)
We discuss two-sample problems and the implementation of a new two-sample data analysis procedure. The proposed procedure is based on the concepts of mid-distribution, design of score functions, components, comparison distribution, comparison density, and exponential model. Assume that we have a random sample $X_1, \ldots, X_m$ from a continuous distribution $F(y) = P(X_i \le y)$, $i = 1, \ldots, m$, and a random sample $Y_1, \ldots, Y_n$ from a continuous distribution $G(y) = P(Y_i \le y)$, $i = 1, \ldots, n$. Also assume that the two samples are independent. The two-sample problem tests the homogeneity of the two samples and can be stated formally as $H_0 : F = G$. Statisticians have proposed a number of tests for the two-sample problem in various contexts. Two typical tests are the two-sample t-test and the Wilcoxon rank-sum test. However, since they test for differences in location, they extract less information from the data than a test of the homogeneity of the distribution functions. Even though the Kolmogorov-Smirnov or Anderson-Darling test statistics can be used to test $H_0 : F = G$, those statistics give no indication of the actual relation of F to G when $H_0 : F = G$ is rejected. Our goal is to learn why it was rejected. Our approach answers this question with graphical tools, which is its main feature. Our approach is functional in the sense that the parameters to be estimated are probability density functions. Compared with other statistical tools for two-sample problems, such as the t-test or the Wilcoxon rank-sum test, density estimation lets us understand the data more fully, which is essential in data analysis. Our approach to density estimation also works with small sample sizes. In addition, our methodology makes almost no assumptions about the two continuous distributions F and G. In that sense, our approach is nonparametric.
Our approach brings graphical elements to the two-sample problem, where few are typically available. Furthermore, our procedure helps researchers conclude why two populations differ when $H_0$ is rejected and describe the relation between F and G graphically.
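
As a rough illustration of the comparison distribution idea named in the abstract, the following Python sketch computes an empirical comparison distribution. It is not the thesis's procedure: the abstract does not give a definition, so the sketch assumes the common form D(u) = F(H^{-1}(u)), where H is the distribution of the pooled sample, and it omits the mid-distribution, score functions, and exponential model entirely. The function names `ecdf`, `comparison_distribution`, and `max_deviation` are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function of a sample."""
    xs = sorted(sample)
    n = len(xs)

    def F(y):
        # Fraction of observations less than or equal to y.
        return bisect_right(xs, y) / n

    return F


def comparison_distribution(x, y):
    """Empirical comparison distribution of the X sample against the pooled sample.

    Assumed definition (not taken from the abstract): D(u) = F(H^{-1}(u)),
    with H the empirical distribution of the pooled sample.
    Returns the points (u_k, D(u_k)) for u_k = k/N, k = 1, ..., N.
    Assumes continuous data, so there are no ties.
    """
    pooled = sorted(list(x) + list(y))
    total = len(pooled)
    F = ecdf(x)
    # The k-th smallest pooled value is the pooled quantile at u = k/N.
    return [((k + 1) / total, F(v)) for k, v in enumerate(pooled)]


def max_deviation(curve):
    """Largest gap between D(u) and the diagonal D(u) = u over the grid."""
    return max(abs(d - u) for u, d in curve)


if __name__ == "__main__":
    # Toy data chosen only for illustration.
    x = [1.0, 2.0, 3.0]
    y = [4.0, 5.0, 6.0]
    curve = comparison_distribution(x, y)
    for u, d in curve:
        print(f"u = {u:.3f}  D(u) = {d:.3f}")
    print("max |D(u) - u| =", max_deviation(curve))
```

If F = G, the points should lie close to the diagonal D(u) = u. Plotting the curve shows where along the pooled quantile scale the X sample is over- or under-represented, which is the kind of graphical reading the abstract describes.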
Bibliographical Information:


School: Texas A&M International University

School Location: USA - Texas

Source Type: Master's Thesis

Keywords: major statistics two sample problem comparison distribution function exponential model density data analysis procedure


Date of Publication:

© 2009 All Rights Reserved.