Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
1181488 | 1491564 | 2012 | 5-page PDF | Free download |

A kernel version of the k-nearest neighbor algorithm (k-NN) has been developed to model the complex relationship between molecular descriptors and the bioactivities of compounds. Kernel k-NN performs the original k-NN algorithm after mapping the training samples from the input space into a high-dimensional feature space. It is easily constructed because the distance between samples in the feature space can be calculated directly from the kernel used. The resulting kernel k-NN is flexible enough to handle complex nonlinear relationships; more importantly, it can also cope with non-vectorial data simply by defining a different kernel. Results obtained on several real SAR datasets indicate that the performance of kernel k-NN is comparable to that of support vector machine methods. It can be regarded as an alternative modeling technique for several chemical problems, including the study of structure–activity relationships (SAR). The source code implementing kernel k-NN in the R language is freely available at http://code.google.com/p/kernelmethods/.
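The feature-space distance mentioned above follows from the standard identity d²(x, y) = k(x, x) − 2·k(x, y) + k(y, y), so no explicit feature mapping is needed. The following is a minimal Python sketch of that idea, not the authors' R implementation: the function names (`rbf_kernel`, `kernel_distance`, `kernel_knn_predict`), the Gaussian kernel, the `gamma` value, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice, not from the paper."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def kernel_distance(x, y, kernel):
    """Distance in feature space, computed from kernel evaluations only."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(d2, 0.0))  # clamp tiny negative rounding errors

def kernel_knn_predict(train_X, train_y, query, k=3, kernel=rbf_kernel):
    """Majority vote among the k training samples nearest in feature space."""
    dists = sorted(
        (kernel_distance(x, query, kernel), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy two-descriptor data (hypothetical, for illustration only).
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["inactive", "inactive", "inactive", "active", "active", "active"]
prediction = kernel_knn_predict(train_X, train_y, (0.95, 1.0), k=3)
print(prediction)
```

Note that with the Gaussian kernel the feature-space distance is a monotone function of the Euclidean distance, so the neighbor ranking matches ordinary k-NN; passing a different `kernel` function (for example a polynomial kernel) is what can change which samples count as nearest.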
► A kernel version of k-NN has been developed.
► The performance of kernel k-NN is competitive with that of SVM.
► Kernel k-NN can cope with non-vectorial data such as strings.
► Weighted kernel k-NN was developed to allow the construction of ROC curves.
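The last two highlights can be illustrated together with a rough Python sketch. It is not the authors' method: the weighting scheme is not described in this abstract, so inverse-distance weighting is assumed here, and a p-spectrum string kernel stands in for whatever string kernel the paper uses. All function names, the `eps` parameter, and the toy strings are hypothetical. The point is that a weighted vote yields a continuous score, and thresholding that score at different levels gives the points of an ROC curve.

```python
import math
from collections import Counter

def spectrum_kernel(s, t, p=2):
    """p-spectrum string kernel: inner product of length-p substring counts."""
    cs = Counter(s[i:i + p] for i in range(len(s) - p + 1))
    ct = Counter(t[i:i + p] for i in range(len(t) - p + 1))
    return float(sum(n * ct[sub] for sub, n in cs.items()))

def kernel_distance(x, y, kernel):
    """Feature-space distance from kernel values only."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(d2, 0.0))

def weighted_knn_score(train_X, train_y, query, k=3,
                       kernel=spectrum_kernel, eps=1e-9):
    """Weighted share of positive (label 1) neighbors, a score in [0, 1].

    Assumption: weights are 1 / (distance + eps); the paper may weight differently.
    """
    nearest = sorted(
        (kernel_distance(x, query, kernel), y) for x, y in zip(train_X, train_y)
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    positive = sum(w for w, (_, y) in zip(weights, nearest) if y == 1)
    return positive / sum(weights)

def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by thresholding at each distinct score."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

# Toy string data (hypothetical): label 1 = active, 0 = inactive.
train_X = ["CCOCC", "CCOC", "CCO", "NNSNN", "NSNN", "NNS"]
train_y = [1, 1, 1, 0, 0, 0]
test_X = ["CCOCO", "NSN"]
test_y = [1, 0]
scores = [weighted_knn_score(train_X, train_y, q) for q in test_X]
curve = roc_points(scores, test_y)
print(scores, curve)
```

Because the classifier only ever calls `kernel(x, y)`, the same code works for vectors, strings, or any other objects for which a valid kernel can be defined.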
Journal: Chemometrics and Intelligent Laboratory Systems - Volume 114, 15 May 2012, Pages 19–23