Article ID: 535030
Journal: Pattern Recognition Letters
Published Year: 2016
Pages: 7
File Type: PDF
Abstract

• kNN-FWPD classifier is proposed with FWPD as the underlying dissimilarity measure.
• kNN-FWPD classifier can be directly applied to datasets having missing features.
• The proposed classifier has similar time complexity compared to the kNN classifier.
• Experiments are conducted on 4 types of missingness: MCAR, MAR, MNAR1, and MNAR2.
• kNN-FWPD is found to outperform ZI, AI, and kNNI in terms of classification accuracy.

The k-Nearest Neighbor (kNN) classifier is an elegant learning algorithm, widely used because it is simple and non-parametric. However, like most learning algorithms, kNN cannot be applied directly to data with missing features. We adopt the philosophy of a Penalized Dissimilarity Measure (PDM) and incorporate a PDM called the Feature Weighted Penalty based Dissimilarity (FWPD) into kNN. The resulting kNN-FWPD classifier can be applied directly to datasets with missing features, without any preprocessing such as marginalization or imputation. Extensive experiments on simulations of four different missing feature mechanisms, using various datasets, suggest that the proposed method handles the missing feature problem much more effectively than some popular imputation methods used in conjunction with kNN.
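The abstract does not state the formula for FWPD, so the following is only a minimal sketch of the general idea of a penalized dissimilarity inside kNN, under assumptions that are not confirmed by the text above. It assumes missing values are encoded as NaN, that the distance is Euclidean over the features observed in both points and scaled by the largest such distance among training pairs, that the penalty is the weight share of the features missing in at least one of the two points (each feature weighted by how many training instances observe it), and that the two terms are mixed by a hypothetical parameter `alpha`. The function names `observed_distance`, `penalized_dissimilarity`, and `knn_predict` are illustrative.

```python
import numpy as np
from collections import Counter


def observed_distance(x, y):
    """Euclidean distance over features observed (non-NaN) in both vectors."""
    both = ~np.isnan(x) & ~np.isnan(y)
    dist = float(np.sqrt(np.sum((x[both] - y[both]) ** 2)))
    return dist, both


def penalized_dissimilarity(x, y, weights, d_max, alpha=0.5):
    """Assumed form: (1 - alpha) * scaled distance + alpha * penalty."""
    dist, both = observed_distance(x, y)
    # Penalty: weight share of features missing in at least one vector.
    penalty = weights[~both].sum() / weights.sum()
    scaled = dist / d_max if d_max > 0 else 0.0
    return (1 - alpha) * scaled + alpha * penalty


def knn_predict(X_train, y_train, x_query, k=3, alpha=0.5):
    """Majority vote among the k training points with the lowest dissimilarity."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Feature weight (assumption): number of training rows observing the feature.
    weights = (~np.isnan(X_train)).sum(axis=0).astype(float)
    n = len(X_train)
    # Normalizer (assumption): largest observed-feature distance between training pairs.
    d_max = max(
        (observed_distance(X_train[i], X_train[j])[0]
         for i in range(n) for j in range(i + 1, n)),
        default=0.0,
    )
    scores = [penalized_dissimilarity(x_query, x, weights, d_max, alpha)
              for x in X_train]
    nearest = np.argsort(scores, kind="stable")[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]


nan = float("nan")
X_train = [[0.0, 0.0], [0.1, nan], [5.0, 5.0], [nan, 5.2]]
y_train = ["a", "a", "b", "b"]
print(knn_predict(X_train, y_train, [0.2, nan], k=3))  # a
print(knn_predict(X_train, y_train, [nan, 5.1], k=3))  # b
```

No imputation step appears anywhere in this sketch: incomplete rows are compared as they are, and a pair that shares few observed features is pushed further apart by the penalty term rather than being filled in with guessed values.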
