Robustified L2 boosting

Article ID	Journal	Published Year	Pages	File Type
417233	Computational Statistics & Data Analysis	2008	11 Pages	PDF

Abstract

Five robustifications of L2 boosting for linear regression with various robustness properties are considered. The first two use the Huber loss as the implementing loss function for boosting, and the next two use robust simple linear regression for the fitting in L2 boosting (i.e., robust base learners). Both concepts can be applied with or without down-weighting of leverage points. Our last method uses robust correlation estimates and appears to be the most robust. Crucial advantages of all methods are that they do not compute covariance matrices of all covariates and that they do not have to identify multivariate leverage points. When there are no outliers, the robust methods are only slightly worse than L2 boosting. In the contaminated case, however, the robust methods outperform L2 boosting by a large margin. Some of the robustifications are also computationally highly efficient and therefore well suited for truly high-dimensional problems.
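To make the first idea concrete, the following is a minimal sketch of componentwise linear boosting with the Huber loss as the implementing loss. It is an illustration under stated assumptions, not the authors' implementation: the function names (`huber_negative_gradient`, `huber_boost`), the step length `nu`, the number of steps, the median starting value, and the rule of setting the Huber threshold to the median absolute residual at each step are all choices made here for the example. The covariates are assumed to be centered and non-constant, and no down-weighting of leverage points is attempted.

```python
import numpy as np


def huber_negative_gradient(residuals, delta):
    """Negative gradient of the Huber loss: residuals clipped to [-delta, delta]."""
    return np.clip(residuals, -delta, delta)


def huber_boost(X, y, n_steps=500, nu=0.1, delta=None):
    """Componentwise linear boosting with the Huber loss (illustrative sketch).

    Assumes the columns of X are centered and non-constant.
    """
    n, p = X.shape
    intercept = float(np.median(y))   # assumption: robust starting value
    coef = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)     # sum of squares of each covariate
    for _ in range(n_steps):
        resid = y - intercept - X @ coef
        # Assumption: if no threshold is given, use the median absolute residual.
        d = delta if delta is not None else np.median(np.abs(resid))
        u = huber_negative_gradient(resid, d)   # pseudo-residuals
        # Simple least-squares slope of the pseudo-residuals on each covariate.
        slopes = X.T @ u / col_ss
        # Select the covariate that reduces the squared error of u the most.
        j = int(np.argmax(slopes ** 2 * col_ss))
        coef[j] += nu * slopes[j]
        intercept += nu * u.mean()    # intercept part of the base learner
    return intercept, coef


# Small demonstration on simulated data with 10% shifted responses.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X -= X.mean(axis=0)                   # center the covariates
beta_true = np.array([2.0, 0.0, 0.0, -1.0, 0.0])
y = 1.0 + X @ beta_true + 0.1 * rng.normal(size=200)
y[:20] += 50.0                        # contaminate the first 20 responses

intercept_hat, coef_hat = huber_boost(X, y)
print("Huber boosting coefficients:", np.round(coef_hat, 2))
```

Only one covariate is fitted per step, so the sketch never forms a covariance matrix of all covariates, which matches the advantage stated in the abstract. Clipping the residuals bounds the influence of outlying responses but does nothing about outlying covariate values; that would require the additional down-weighting of leverage points mentioned above.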