Article code: 458670
Journal code: 696179
Publication year: 2012
English article: 8-page PDF
Full-text version: Free download
English title of the ISI article
Noisy data elimination using mutual k-nearest neighbor for classification mining
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Networks and Communications
English abstract

k nearest neighbor (kNN) is an effective and powerful lazy learning algorithm that is also easy to implement. However, its performance relies heavily on the quality of the training data. In many complex real-world applications, noise from various sources is prevalent in large-scale databases, and eliminating anomalies to improve data quality remains a challenge. To alleviate this problem, we propose a new anomaly removal and learning algorithm within the kNN framework. Its primary characteristic is that the evidence for removing anomalies and for predicting the class labels of unseen instances comes from mutual nearest neighbors rather than k nearest neighbors. The advantage is that pseudo nearest neighbors can be identified and excluded from the prediction process, so the final learning result is more credible. An extensive comparative experimental analysis on UCI datasets provides empirical evidence that the proposed method enhances the performance of the k-NN rule.


► A new lazy learning algorithm, named MkNNC, is designed for pattern classification. MkNNC is an instance-based learning method whose core idea is intuitive and easy to implement. It is also more robust when it encounters noisy or inconsistent data.
► Anomalies are first detected with mutual nearest neighbors and removed from the database before the classification model is constructed. Consequently, noisy data do not act as deciding evidence during the learning process, and the final prediction results are more credible.
► MkNNC involves both classification learning and anomaly detection and elimination. Both are carried out with mutual nearest neighbors (MNN), which carry more useful and reliable information than kNN for determining the relationship between instances.
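
The idea in the abstract and highlights can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean distance, the rule that an instance with no mutual neighbors counts as an anomaly, and the fallback to a plain k-NN vote are all choices made for this sketch. Two instances are treated as mutual neighbors when each appears in the other's k-nearest-neighbor set.

```python
import math
from collections import Counter


def knn(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])


def mutual_neighbors(points, k):
    """For each i, the indices j where i and j are in each other's k-NN set."""
    nn = [knn(points, i, k) for i in range(len(points))]
    return [{j for j in nn[i] if i in nn[j]} for i in range(len(points))]


def remove_noise(points, labels, k):
    """Keep only instances with at least one mutual neighbor.

    Assumption of this sketch: an empty mutual-neighbor set marks an anomaly.
    """
    mnn = mutual_neighbors(points, k)
    kept = [i for i in range(len(points)) if mnn[i]]
    return [points[i] for i in kept], [labels[i] for i in kept]


def predict(points, labels, query, k):
    """Majority vote over the query's mutual neighbors in the training set."""
    order = sorted(range(len(points)), key=lambda j: math.dist(query, points[j]))
    votes = []
    for j in order[:k]:
        # Distance from points[j] to its k-th nearest training neighbor.
        d = sorted(math.dist(points[j], p) for t, p in enumerate(points) if t != j)
        radius = d[k - 1] if len(d) >= k else math.inf
        # The query counts as mutual if it would fall inside that radius.
        if math.dist(query, points[j]) <= radius:
            votes.append(labels[j])
    if not votes:
        # Assumed fallback when no mutual neighbor exists: plain k-NN vote.
        votes = [labels[j] for j in order[:k]]
    return Counter(votes).most_common(1)[0][0]


# Two tight clusters plus one far-away instance carrying label "a".
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6), (20, 20)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "a"]

clean_points, clean_labels = remove_noise(points, labels, k=3)
```

In this toy data the instance at (20, 20) has nearest neighbors in the second cluster, but none of them lists it among their own three nearest neighbors, so it has no mutual neighbors and is dropped before prediction. Calling `predict(clean_points, clean_labels, (0.5, 0.5), 3)` then votes only among training instances that also regard the query as a near neighbor.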

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 85, Issue 5, May 2012, Pages 1067–1074
Authors
, ,