Article code: 416588
Journal code: 681388
Publication year: 2007
English article: 14-page PDF
Full-text version: Free download
English title of the ISI article
Efficient algorithms for computing the best subset regression models for large-scale problems
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract

Several strategies for computing the best subset regression models are proposed. Some of the algorithms are modified versions of existing regression-tree methods, while others are new. The first algorithm selects the best subset models within a given size range. It uses a reduced search space and is found to outperform the existing branch-and-bound algorithm computationally. The properties and computational aspects of the proposed algorithm are discussed in detail. The second new algorithm preorders the variables inside the regression tree. A radius is defined in order to measure the distance of a node from the root of the tree. The algorithm applies the preordering to all nodes whose distance from the root is smaller than a radius that is given a priori. An efficient method of preordering the variables is employed. The experimental results indicate that the algorithm performs best when preordering is employed on a radius of between one quarter and one third of the number of variables. The algorithm has been applied with such a radius to tackle large-scale subset-selection problems that are considered to be computationally infeasible by conventional exhaustive-selection methods. A class of new heuristic strategies is also proposed. The most important of these is one that assigns a different tolerance value to each subset model size. This strategy, with different kinds of tolerances, is equivalent to all exhaustive and heuristic subset-selection strategies. In addition, the strategy can be used to investigate submodels having noncontiguous size ranges. Its implementation provides a flexible tool for tackling large-scale models.
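
As a rough illustration of the ideas summarized in the abstract, the following Python sketch walks a regression tree in which each child node drops one variable, and cuts a node when its residual sum of squares (RSS) cannot improve on the best models found so far. It is not the authors' implementation. The names `rss`, `best_subsets`, `brute_force` and the `tol` dictionary (one tolerance per model size, defaulting to 0) are assumptions made for this example. Each submodel is refitted from scratch with `numpy.linalg.lstsq`, which is far slower than the matrix updates an efficient algorithm would use, and no preordering of variables is performed.

```python
import itertools
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit of y on X[:, cols]."""
    A = X[:, list(cols)]
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    resid = y - A @ beta
    return float(resid @ resid)


def best_subsets(X, y, tol=None):
    """Branch-and-bound search for the lowest-RSS subset of every size.

    tol maps a model size j to a tolerance tau_j >= 0 (hypothetical
    parameter for this sketch). With all tolerances 0 the search is exact;
    larger values cut more nodes and return models whose RSS is at most
    (1 + tau_j) times the optimum for size j.
    """
    n = X.shape[1]
    tol = tol or {}
    best = {j: (np.inf, None) for j in range(1, n + 1)}
    visited = 0

    def visit(S, k):
        # Node (S, k): the first k variables of S are fixed; the node
        # contributes the leading submodels S[:k+1], ..., S[:len(S)].
        nonlocal visited
        sizes = range(k + 1, len(S) + 1)
        bound = rss(X, y, S)  # lower bound on the RSS of any subset of S
        if all((1.0 + tol.get(j, 0.0)) * bound >= best[j][0] for j in sizes):
            return  # nothing below this node can improve enough: cut it
        visited += 1
        for j in sizes:
            value = bound if j == len(S) else rss(X, y, S[:j])
            if value < best[j][0]:
                best[j] = (value, tuple(sorted(S[:j])))
        # Children drop one non-fixed variable; dropping the last one
        # would yield no new submodel, so it is skipped.
        for i in range(k, len(S) - 1):
            visit(S[:i] + S[i + 1:], i)

    visit(tuple(range(n)), 0)
    return best, visited


def brute_force(X, y):
    """Reference answer: evaluate every subset of every size."""
    n = X.shape[1]
    return {
        j: min((rss(X, y, c), c) for c in itertools.combinations(range(n), j))
        for j in range(1, n + 1)
    }
```

Calling `best_subsets(X, y)` returns, for each size, the lowest RSS and the corresponding column indices, together with the number of tree nodes that were not cut. Passing something like `tol={2: 0.2, 3: 0.2}` trades accuracy for speed on those sizes only, which mirrors the per-size tolerance idea described above.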

Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Statistics & Data Analysis - Volume 52, Issue 1, 15 September 2007, Pages 16–29