Article code: 4946042
Journal code: 1439265
Publication year: 2017
English article: 13-page PDF
Full-text version: free download
English title of the ISI article
Online feature selection for high-dimensional class-imbalanced data
Persian translation of the title
انتخاب ویژگی های آنلاین برای داده های با واسطه طبقه ای با ابعاد بزرگ
Keywords
Related topics
Engineering and basic sciences, Computer engineering, Artificial intelligence
English abstract
When tackling high dimensionality in data mining, online feature selection, which deals with features flowing in one by one over time, offers advantages over traditional feature selection methods. However, in real-world applications such as fraud detection and medical diagnosis, the data is both high-dimensional and highly class-imbalanced; that is, some classes have many more instances than others. Under such class imbalance, existing online feature selection algorithms usually ignore the small classes, which can be important in these applications. Learning from high-dimensional, class-imbalanced data in an online manner is therefore a challenge. Motivated by this, we first formalize the problem of online streaming feature selection for class-imbalanced data, and then present an efficient online feature selection framework that considers the dependency between condition features and decision classes. We also propose a new algorithm, Online Feature Selection based on the Dependency in K nearest neighbors, called K-OFSD. Drawing on Neighborhood Rough Set theory, K-OFSD uses the information of nearest neighbors to select relevant features that yield higher separability between the majority class and the minority class. Finally, experimental studies on seven high-dimensional, class-imbalanced data sets show that our algorithm achieves better performance than traditional feature selection methods with the same number of features, and than state-of-the-art online streaming feature selection algorithms in an online manner.
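The idea of scoring streamed features by a nearest-neighbor dependency can be illustrated with a small sketch. This is not the authors' K-OFSD algorithm, whose details the abstract does not give. It is a simplified stand-in built on stated assumptions: the dependency of a feature subset is taken to be the average fraction of each sample's k nearest neighbors that share its class, and a newly arrived feature is kept only if it strictly raises that score. The names `knn_dependency` and `stream_select`, the Euclidean distance, and the acceptance rule are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_dependency(X, y, k=2):
    """Mean fraction of each sample's k nearest neighbors sharing its label.

    Assumed stand-in for a neighborhood dependency measure; not the
    definition used in the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if X.ndim == 1:
        X = X[:, None]
    # Pairwise squared Euclidean distances in the current feature subspace.
    diff = X[:, None, :] - X[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbor
    neighbors = np.argsort(dist, axis=1, kind="stable")[:, :k]
    same_label = y[neighbors] == y[:, None]
    return float(same_label.mean())


def stream_select(X, y, k=2):
    """Visit columns in arrival order; keep one only if it raises the score."""
    X = np.asarray(X, dtype=float)
    selected, best = [], 0.0
    for j in range(X.shape[1]):
        candidate = selected + [j]
        score = knn_dependency(X[:, candidate], y, k)
        if score > best:  # assumed acceptance rule: strict improvement
            selected, best = candidate, score
    return selected, best


# Toy imbalanced data: six majority samples (0) and three minority samples (1).
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1])
informative = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 10.0, 10.1, 10.2])
noise = np.array([0.0, 5.0, 1.0, 6.0, 2.0, 7.0, 0.5, 5.5, 1.5])
features = np.column_stack([informative, noise])

selected, best = stream_select(features, labels, k=2)
print(selected, best)
```

On this toy data the informative column alone already separates the minority class from the majority class, so the noise column that arrives afterwards cannot improve the score and is rejected. A measure averaged over all samples is dominated by the majority class; a variant that weights or reports the minority class separately would be closer in spirit to the class-imbalance concern raised in the abstract.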
Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 136, 15 November 2017, Pages 187-199