کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
416195 681296 2007 13 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Bayesian identification of clustered outliers in multiple regression
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Bayesian identification of clustered outliers in multiple regression
چکیده انگلیسی

We propose a Bayesian model for clustered outliers in multiple regression. In the literature, outliers are frequently modeled as coming from a subgroup where the variance of the errors is much larger than in the rest of the data. By contrast, when a cluster of outliers exists, we show that it can be more informative to model them as coming from a subgroup where different regression coefficients hold. We can explicitly model the clustering phenomenon by assuming that the probability of an outlier is a function of the explanatory variables. Fitting proceeds via the Gibbs sampler, using the Metropolis–Hastings algorithm to produce variates from the more unusual distributions. Initialization uses a least median of squares fit, and in some ways this method can be viewed as a Bayesian version of the many algorithms that use this fit as a start to some more efficient estimator. This method works very well in a variety of test data sets. We illustrate its use in a data set of sailboat prices, where it yields information both on the identity of the outliers and on their location, spread, and the regression coefficients inside the minority subgroup.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computational Statistics & Data Analysis - Volume 51, Issue 8, 1 May 2007, Pages 3955–3967
نویسندگان
,