Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
1146152 | Journal of Multivariate Analysis | 2010 | 20 Pages |
Let X_1, …, X_n be identically distributed random vectors in ℝ^d, independently drawn according to some probability density. An observation X_i is said to be a layered nearest neighbour (LNN) of a point x if the hyperrectangle defined by x and X_i contains no other data points. We first establish consistency results on L_n(x), the number of LNN of x. Then, given a sample (X,Y), (X_1,Y_1), …, (X_n,Y_n) of independent identically distributed random vectors from ℝ^d × ℝ, one may estimate the regression function r(x) = E[Y | X = x] by the LNN estimate r_n(x), defined as an average over the Y_i's corresponding to those X_i which are LNN of x. Under mild conditions on r, we establish the convergence of E|r_n(x) − r(x)|^p to 0 as n → ∞, for almost all x and all p ≥ 1, and discuss the links between r_n and the random forest estimates of Breiman (2001) [8]. We finally show the universal consistency of the bagged (bootstrap-aggregated) nearest neighbour method for regression and classification.
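The LNN definition and the resulting estimate can be illustrated with a short brute-force sketch. It is not taken from the paper: the function names `lnn_indices` and `lnn_estimate` are hypothetical, the hyperrectangle is treated as closed (an assumption that is harmless when the data have a density, since ties then occur with probability zero), and the quadratic-time scan is chosen for clarity rather than efficiency.

```python
def lnn_indices(x, points):
    """Indices i such that points[i] is a layered nearest neighbour of x.

    Assumption: the hyperrectangle spanned by x and points[i] is closed,
    so a data point on its boundary counts as lying inside it.
    """
    def in_box(p, corner):
        # p is inside the box if every coordinate lies between x and corner.
        return all(min(a, c) <= v <= max(a, c)
                   for a, c, v in zip(x, corner, p))

    return [
        i for i, xi in enumerate(points)
        # points[i] is an LNN if no *other* data point falls in its box.
        if not any(j != i and in_box(xj, xi) for j, xj in enumerate(points))
    ]


def lnn_estimate(x, points, responses):
    """Average of the responses whose points are LNN of x."""
    idx = lnn_indices(x, points)
    if not idx:
        # Happens for an empty sample, or with exact ties under the
        # closed-box convention assumed above.
        raise ValueError("no layered nearest neighbour found")
    return sum(responses[i] for i in idx) / len(idx)


if __name__ == "__main__":
    pts = [(1, 1), (2, 2), (-1, 3), (3, -1), (0.5, 4)]
    ys = [1.0, 10.0, 2.0, 3.0, 4.0]
    print(lnn_indices((0, 0), pts))       # (2, 2) is excluded: (1, 1) lies in its box
    print(lnn_estimate((0, 0), pts, ys))  # average of the remaining four responses
```

In this toy sample, the point (2, 2) is not an LNN of the origin because (1, 1) lies inside the rectangle it spans with the origin, so its response does not enter the average. Note that the number of LNN, L_n(x), is determined by the data rather than fixed in advance.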