Article code: 4946056
Journal code: 1439266
Publication year: 2017
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Unsupervised feature selection based on the Morisita estimator of intrinsic dimension
Keywords
Unsupervised feature selection, Morisita index, intrinsic dimension, redundancy minimization, data mining
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract
This paper presents a new filter algorithm for selecting the smallest subset of features carrying all the information content of a dataset (i.e., for removing redundant features). It is an advanced version of the fractal dimension reduction technique, and it relies on the recently introduced Morisita estimator of Intrinsic Dimension (ID). Here, the ID is used to quantify dependencies between subsets of features, which allows the effective processing of highly non-linear data. The proposed algorithm is successfully tested on simulated and real-world case studies. Different levels of sample size and noise are examined along with the variability of the results. In addition, a comprehensive procedure based on random forests shows that the data dimensionality is significantly reduced by the algorithm without loss of relevant information. Finally, comparisons with benchmark feature selection techniques demonstrate the promising performance of this new filter.
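The idea that an ID estimate can quantify dependencies between features can be illustrated with a small sketch. The code below is not the paper's algorithm: it is a simplified grid-based Morisita-style estimate, written under the assumptions that the multipoint order is m = 2, that the data are min-max rescaled to the unit cube, and that the grid resolutions are 2, 4, 8 and 16 cells per axis. The function name `morisita_id`, its parameters, and the toy data are hypothetical and chosen only for illustration. It shows that adding a redundant feature (a non-linear function of an existing one) leaves the estimated ID roughly unchanged, while adding an independent feature raises it by about one.

```python
import numpy as np


def morisita_id(X, cells_per_axis=(2, 4, 8, 16), m=2):
    """Grid-based Morisita-style estimate of intrinsic dimension (sketch).

    Assumptions (not taken from the record above): features are min-max
    rescaled to [0, 1], and the slope is fitted over a fixed set of grids.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, e = X.shape

    # Rescale every feature to [0, 1]; constant columns are left at 0.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    U = (X - X.min(axis=0)) / span

    log_inv_delta, log_index = [], []
    for cells in cells_per_axis:
        # Assign each point to a grid cell of edge length delta = 1 / cells.
        idx = np.minimum((U * cells).astype(int), cells - 1)
        _, counts = np.unique(idx, axis=0, return_counts=True)
        counts = counts.astype(float)

        # Multipoint Morisita index: Q^(m-1) * sum n_i(n_i-1)... / N(N-1)...
        num = np.ones_like(counts)
        den = 1.0
        for k in range(m):
            num *= counts - k
            den *= n - k
        q = float(cells) ** e  # total number of cells, empty ones included
        index = q ** (m - 1) * num.sum() / den

        log_inv_delta.append(np.log(cells))
        log_index.append(np.log(index))

    # Slope of log(index) against log(1 / delta), then ID = E - slope / (m - 1).
    slope = np.polyfit(log_inv_delta, log_index, 1)[0]
    return e - slope / (m - 1)


# Toy data: x2 is a non-linear function of x1 (redundant), x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.uniform(size=5000)
x2 = x1 ** 2
x3 = rng.uniform(size=5000)

id_1 = morisita_id(x1)
id_12 = morisita_id(np.column_stack([x1, x2]))
id_13 = morisita_id(np.column_stack([x1, x3]))

# A small increase suggests redundancy; an increase near 1 suggests new information.
print("ID gain from x2:", id_12 - id_1)
print("ID gain from x3:", id_13 - id_1)
```

A filter built on this idea would compare such ID differences across candidate feature subsets; how the published algorithm searches those subsets is not described in this record, so that part is deliberately left out.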
Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 135, 1 November 2017, Pages 125-134