Article code  Journal code  Publication year  English article  Full-text version
533996  870201  2013  8 pages, PDF  Free download
English title of the ISI article
Assessing similarity of feature selection techniques in high-dimensional domains
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract


• We propose a methodology for assessing the similarity of feature selection methods.
• An empirical study has been conducted on the genomics domain.
• The study involves some of the most popular selection methods.
• A similarity trend has been derived for feature subsets of increasing size.

Recent research efforts attempt to combine multiple feature selection techniques instead of using a single one. However, this combination is often made on an “ad hoc” basis, depending on the specific problem at hand, without considering the degree of diversity/similarity of the involved methods. Moreover, though it is recognized that different techniques may return quite dissimilar outputs, especially in high dimensional/small sample size domains, few direct comparisons exist that quantify these differences and their implications on classification performance. This paper aims to provide a contribution in this direction by proposing a general methodology for assessing the similarity between the outputs of different feature selection methods in high dimensional classification problems. Using as benchmark the genomics domain, an empirical study has been conducted to compare some of the most popular feature selection methods, and useful insight has been obtained about their pattern of agreement.
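
The idea of comparing the outputs of two selection methods over subsets of increasing size can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which similarity measure the paper uses, so the Jaccard index is swapped in here as a common choice, and the feature names, the two rankings, and the helper names `jaccard` and `similarity_trend` are all hypothetical.

```python
# Illustrative sketch only. The Jaccard index is an assumed stand-in for the
# paper's similarity measure, and the rankings below are made-up data.

def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two feature subsets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty subsets are treated as identical
    return len(a & b) / len(a | b)


def similarity_trend(ranking_a, ranking_b, sizes):
    """Similarity of the top-k subsets of two rankings, for each k in sizes."""
    return [(k, jaccard(ranking_a[:k], ranking_b[:k])) for k in sizes]


# Hypothetical rankings (best feature first) produced by two selection methods.
ranking_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
ranking_b = ["g2", "g1", "g5", "g3", "g6", "g4"]

trend = similarity_trend(ranking_a, ranking_b, sizes=[1, 2, 4, 6])
for k, s in trend:
    print(f"top-{k}: Jaccard = {s:.2f}")
```

Plotting such values against the subset size would give one possible view of a similarity trend like the one mentioned in the highlights. Since the full set of features is the same for every method, the value always reaches 1 once the subset covers all of them.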

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 34, Issue 12, 1 September 2013, Pages 1446–1453