Article ID: 458670
Journal: Journal of Systems and Software
Published Year: 2012
Pages: 8 Pages
File Type: PDF
Abstract

k nearest neighbor (kNN) is an effective and powerful lazy learning algorithm, despite being easy to implement. However, its performance relies heavily on the quality of the training data. In many complex real-world applications, noise from various sources is prevalent in large-scale databases, and eliminating anomalies to improve data quality remains a challenge. To alleviate this problem, in this paper we propose a new anomaly removal and learning algorithm under the framework of kNN. The primary characteristic of our method is that the evidence used for removing anomalies and for predicting the class labels of unseen instances is the set of mutual nearest neighbors, rather than the k nearest neighbors. The advantage is that pseudo nearest neighbors can be identified and are not taken into account during the prediction process. Consequently, the final learning result is more credible. An extensive comparative experimental analysis carried out on UCI datasets provides empirical evidence of the effectiveness of the proposed method in enhancing the performance of the kNN rule.

► A new lazy learning algorithm, named MkNNC, is designed for pattern classification. MkNNC is an instance-based learning method. Its core idea is relatively intuitive and easy to implement, and it is more robust when it encounters noisy or inconsistent data.
► Anomalies are first detected and removed from the database by means of mutual nearest neighbors before the classification model is constructed. Consequently, information from noisy data is not used as a determining condition during the learning process, and the final prediction results are more credible.
► MkNNC involves classification learning as well as anomaly detection and elimination. Both are carried out with mutual nearest neighbors (MNN), which carry more useful and reliable information than kNN in determining the relationship between instances.
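
The two steps described here, anomaly removal followed by prediction from mutual nearest neighbors, can be illustrated with a minimal Python sketch. It is not the authors' MkNNC implementation: the abstract does not give the exact rules, so several details are assumptions made for illustration. These are the Euclidean distance, the rule that a training point with no mutual neighbor is treated as an anomaly, the test for whether a training point would count the query among its own k nearest neighbors, the fallback to plain kNN voting when the query has no mutual neighbor, and all function names and sample data.

```python
from collections import Counter
from math import dist


def k_nearest(query, points, k, skip=None):
    """Indices of the k points closest to `query`, optionally skipping one index."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: dist(query, points[i]),
    )
    return order[:k]


def mutual_neighbors(points, k):
    """For each point i, the set of j such that i and j appear in each other's k-NN list."""
    knn = [set(k_nearest(p, points, k, skip=i)) for i, p in enumerate(points)]
    return [{j for j in knn[i] if i in knn[j]} for i in range(len(points))]


def remove_anomalies(points, labels, k):
    """Assumed rule: a training point with no mutual nearest neighbor is an anomaly."""
    mnn = mutual_neighbors(points, k)
    keep = [i for i, neighbors in enumerate(mnn) if neighbors]
    return [points[i] for i in keep], [labels[i] for i in keep]


def predict(query, points, labels, k):
    """Majority vote among the training points that are mutual neighbors of `query`."""
    candidates = k_nearest(query, points, k)
    mutual = []
    for j in candidates:
        others = k_nearest(points[j], points, k, skip=j)
        # Assumed test: the query would enter j's k-NN list if it is no farther
        # from j than j's current k-th nearest training point.
        if not others or dist(query, points[j]) <= dist(points[j], points[others[-1]]):
            mutual.append(j)
    voters = mutual or candidates  # assumed fallback: plain kNN vote
    return Counter(labels[j] for j in voters).most_common(1)[0][0]


# Hypothetical data: two tight clusters plus one isolated, mislabeled point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (20, 20)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "a"]

clean_points, clean_labels = remove_anomalies(points, labels, k=3)
label_near_a = predict((0.5, 0.5), clean_points, clean_labels, k=3)
label_near_b = predict((5.5, 5.5), clean_points, clean_labels, k=3)
```

In this toy data the isolated point at (20, 20) lists three points of the second cluster as its nearest neighbors, but none of them lists it in return, so it has no mutual neighbor and is dropped before prediction. The sketch recomputes neighbor lists on every call, which is quadratic in the number of training points; a spatial index would be needed for anything beyond small examples.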

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Networks and Communications