Article code | Journal code | Publication year | English article | Full text
4942744 | 1437416 | 2017 | 8 pages, PDF | Free download
English title of the ISI article
Occam's razor in dimension reduction: Using reduced row Echelon form for finding linear independent features in high dimensional microarray datasets
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
High-dimensional microarray datasets suffer from small sample sizes and extremely large numbers of features. Feature selection therefore plays a crucial role in the performance of models trained on those datasets. A typical feature selection method consists of two main parts: a problem criterion and a search strategy. Common datasets do not have a huge number of features relative to their number of samples; hence, the search strategy in their feature selection methods is able to explore the search space. In contrast, high-dimensional microarray datasets have a huge number of features; their search space is therefore very large, and searching it is prohibitively expensive. In this paper, we apply the philosophy of Occam's razor to feature subset selection in order to free high-dimensional datasets from computational search methods. The proposed method selects features in two stages. In the first stage, features are rearranged by their importance in the dataset; in the second stage, the fundamental concept of reduced row echelon form is applied to the dataset in order to find linearly independent features. To determine the effectiveness of the proposed method, experiments are carried out on nine binary high-dimensional microarray datasets. The results are compared with eleven state-of-the-art feature selection algorithms, including Correlation-based Feature Selection (CFS), Fast Correlation-Based Filter (FCBF), Interact (INT) and Maximum Relevancy Minimum Redundancy (MRMR). The average results are analyzed with a non-parametric statistical test, which reveals that the proposed method is significantly superior to the others in terms of accuracy, sensitivity, specificity, G-mean, number of selected features and computational complexity.
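The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which importance measure is used, so the ranking score here (absolute difference of class means) is an assumption, as are the function names `rank_features`, `rref_pivot_columns` and `select_features` and the tolerance `tol`. The second stage relies on a standard fact: the pivot columns of the reduced row echelon form are a maximal set of linearly independent columns, chosen left to right, so reordering by importance first makes the more important features win.

```python
def rank_features(X, y):
    """Stage 1: order feature indices from most to least important.

    Assumed criterion (not from the paper): absolute difference between
    the class means of each feature, for binary labels 0 and 1.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]

    def score(j):
        mean_pos = sum(row[j] for row in pos) / len(pos)
        mean_neg = sum(row[j] for row in neg) / len(neg)
        return abs(mean_pos - mean_neg)

    return sorted(range(len(X[0])), key=score, reverse=True)


def rref_pivot_columns(A, tol=1e-9):
    """Stage 2: return the pivot column indices of the RREF of A.

    Gauss-Jordan elimination with partial pivoting; entries with
    absolute value <= tol are treated as zero (tol is an assumption).
    """
    M = [[float(v) for v in row] for row in A]
    n_rows, n_cols = len(M), len(M[0])
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        if r == n_rows:
            break
        # Pick the row at or below r with the largest entry in column c.
        p = max(range(r, n_rows), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) <= tol:
            continue  # column c depends on earlier pivot columns
        M[r], M[p] = M[p], M[r]
        pivot_value = M[r][c]
        M[r] = [v / pivot_value for v in M[r]]
        # Clear column c in every other row.
        for i in range(n_rows):
            if i != r and M[i][c] != 0.0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return pivots


def select_features(X, y):
    """Rank the features, then keep only the linearly independent ones."""
    order = rank_features(X, y)
    reordered = [[row[j] for j in order] for row in X]
    return [order[c] for c in rref_pivot_columns(reordered)]
```

For example, with samples `[[1, 2, 0, 5], [2, 4, 1, 6], [5, 10, 0, 1], [6, 12, 1, 2]]` and labels `[0, 0, 1, 1]`, feature 1 is exactly twice feature 0 and ranks higher under the assumed score, so `select_features` keeps feature 1 and drops feature 0. Because there can be at most as many pivots as rows, the number of selected features never exceeds the number of samples, which fits the small-sample setting the abstract describes.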
Publisher
Database: Elsevier - ScienceDirect
Journal: Engineering Applications of Artificial Intelligence - Volume 62, June 2017, Pages 214-221
Authors
, , , ,