Mean preservation in censored regression using preliminary nonparametric smoothing

by Heuchenne, Cédric

Abstract (Summary)
In this thesis, we consider the problem of estimating the regression function in location-scale regression models. This model assumes that the random vector (X,Y) satisfies Y = m(X) + s(X)e, where m(.) is an unknown location function (e.g. conditional mean, median, truncated mean,...), s(.) is an unknown scale function, and e is independent of X. The response Y is subject to random right censoring, and the covariate X is completely observed. In the first part of the thesis, we assume that m(x) = E(Y|X=x) follows a polynomial model. A new estimation procedure for the unknown regression parameters is proposed, which extends the classical least squares procedure to censored data. The proposed method is inspired by the method of Buckley and James (1979), but is, unlike the latter method, a non-iterative procedure due to nonparametric preliminary estimation. The asymptotic normality of the estimators is established. Simulations are carried out for both methods and they show that the proposed estimators have usually smaller variance and smaller mean squared error than the Buckley-James estimators. For the second part, suppose that m(.)=E(Y|.) belongs to some parametric class of regression functions. A new estimation procedure for the true, unknown vector of parameters is proposed, that extends the classical least squares procedure for nonlinear regression to the case where the response is subject to censoring. The proposed technique uses new `synthetic' data points that are constructed by using a nonparametric relation between Y and X. The consistency and asymptotic normality of the proposed estimator are established, and the estimator is compared via simulations with an estimator proposed by Stute in 1999. In the third part, we study the nonparametric estimation of the regression function m(.). It is well known that the completely nonparametric estimator of the conditional distribution F(.|x) of Y given X=x suffers from inconsistency problems in the right tail (Beran, 1981), and hence the location function m(x) cannot be estimated consistently in a completely nonparametric way, whenever m(x) involves the right tail of F(.|x) (like e.g. for the conditional mean). We propose two alternative estimators of m(x), that do not share the above inconsistency problems. The idea is to make use of the assumed location-scale model, in order to improve the estimation of F(.|x), especially in the right tail. We obtain the asymptotic properties of the two proposed estimators of m(x). Simulations show that the proposed estimators outperform the completely nonparametric estimator in many cases.
Bibliographical Information:


School:Université catholique de Louvain

School Location:Belgium

Source Type:Master's Thesis

Keywords:kernel estimation location scale model least squares survival analysis right censoring bandwidth selection linear regression nonparametric fatigue life data censored


Date of Publication:08/18/2005

© 2009 All Rights Reserved.