Article code | Journal code | Publication year | English article | Full-text version
847927 | 909234 | 2014 | 6-page PDF | Free download
English title of the ISI article
Relative density-based classification noise detection
Persian translation of the title
تشخیص صدا بر اساس طبقه بندی نسبی
Related topics
Engineering and Basic Sciences, Other Engineering Fields, Engineering (General)
English abstract

Classification noise is a common byproduct of traditional data mining approaches, and no specialized approach for detecting classification noise is currently available. Methods for outlier detection are well developed, but outliers and classification noise differ enough in their characteristics to make outlier detection algorithms unsuitable for classification noise detection. In this paper, a new, specialized approach to detecting classification noise is proposed, named relative density-based classification noise detection (RDBCND). Computational experiments on artificial data sets described herein show that RDBCND has a time complexity of O(n log n), indicating greater efficiency than traditional approaches, which exhibit a time complexity of at least O(n^2). The use of classification noise detection to improve the generalization ability of common classifier algorithms is also described. In particular, a new unified approach based on RDBCND is compared with a cross-validation approach applied to a BP neural network. Trials on both artificial and real-life datasets show that the RDBCND-based approach can greatly accelerate the process of identifying the best decision function. The novel method can also eliminate underfitting, as the algorithm simply searches for the highest training accuracy. The experiments also show that the RDBCND-based method achieves greater accuracy and lower CPU time in reaching global solutions than the cross-validation method. Because relative density is a local concept, the new approach can be applied directly to nonlinear datasets without data transformation. This is a significant advantage over some linear classifier algorithms: current linear classifiers need kernel functions or other transformations to handle nonlinear datasets, which increases their complexity.
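
The abstract does not define the relative density itself, so the following Python sketch only illustrates the general idea of comparing a point's same-class neighbourhood with its other-class neighbourhood. The score used here (mean distance to the k nearest same-class points versus mean distance to the k nearest other-class points), the function name `relative_density_noise`, and the parameters `k` and `threshold` are all assumptions made for illustration, not the paper's definitions. The search is brute force and therefore quadratic; it does not reproduce the O(n log n) behaviour reported for RDBCND.

```python
import math


def relative_density_noise(points, labels, k=3, threshold=1.0):
    """Return indices of points that look mislabelled under a hypothetical score.

    A point is flagged when the mean distance to its k nearest same-class
    points exceeds `threshold` times the mean distance to its k nearest
    other-class points. This score is an illustrative assumption, not the
    definition used in the RDBCND paper.
    """
    flagged = []
    for i, (p, y) in enumerate(zip(points, labels)):
        same, other = [], []
        for j, (q, z) in enumerate(zip(points, labels)):
            if i == j:
                continue
            # Collect distances separately for same-class and other-class points.
            (same if z == y else other).append(math.dist(p, q))
        if not same or not other:
            continue  # score is undefined without both kinds of neighbours
        nearest_same = sorted(same)[:k]
        nearest_other = sorted(other)[:k]
        same_mean = sum(nearest_same) / len(nearest_same)
        other_mean = sum(nearest_other) / len(nearest_other)
        # Compare without dividing so that a zero other-class distance is safe.
        if same_mean > threshold * other_mean:
            flagged.append(i)
    return flagged
```

For two well-separated clusters with one point placed inside the first cluster but carrying the second cluster's label, this sketch flags only that point. Because the score depends only on each point's local neighbourhood, no kernel or other data transformation is involved, which matches the locality argument made in the abstract.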

Publisher
Database: Elsevier - ScienceDirect
Journal: Optik - International Journal for Light and Electron Optics - Volume 125, Issue 22, November 2014, Pages 6829–6834