Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
1180734 | Chemometrics and Intelligent Laboratory Systems | 2014 | 8 Pages | |
• We combine “repeated double cross validation” (rdCV) with KNN classification.
• rdCV estimates the optimum number of neighbors in KNN, independently of the evaluation.
• rdCV gives cautious estimates of predictive abilities (and their variabilities).
• We apply KNN-rdCV to classify the origin of Italian olive oils.
• We apply KNN-rdCV to classify minerals relevant in comet dust particles (ROSETTA).
Repeated double cross validation (rdCV) has recently been suggested as a careful and conservative strategy for optimizing and evaluating empirical multivariate calibration models. In this work, the strategy is adapted for k-nearest neighbor (KNN) classification. The basics of rdCV are described, including the search for an optimum k, and the method is tested with the Italian olive oil data. KNN-rdCV is then applied to classify 17 mineral groups relevant for the composition of comet dust particles; each sample is characterized by the peak heights at 20 selected masses in time-of-flight secondary ion mass spectra (TOF-SIMS). Predictive abilities are > 95% for 15 mineral classes and 75% and 85% for the remaining two.
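The double cross validation idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`knn_predict`, `rdcv_knn`), the fold counts, the candidate values of k, the use of overall accuracy instead of class-wise predictive abilities, and the synthetic two-class data are all assumptions made for illustration. The inner loop chooses k using calibration data only; the outer loop evaluates that k on test objects that took no part in the choice; repeating with different random splits gives a distribution of performance values rather than a single number.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training objects (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_idx, test_idx, X, y, k):
    """Fraction of test objects classified correctly by KNN built on train_idx."""
    tX = [X[i] for i in train_idx]
    ty = [y[i] for i in train_idx]
    hits = sum(knn_predict(tX, ty, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)


def make_folds(indices, n_folds, rng):
    """Randomly split a list of indices into n_folds disjoint segments."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[f::n_folds] for f in range(n_folds)]


def rdcv_knn(X, y, k_values=(1, 3, 5), n_rep=5, n_outer=3, n_inner=4, seed=0):
    """Repeated double CV: returns outer-loop accuracies and the k chosen in each outer fold."""
    rng = random.Random(seed)
    outer_scores, chosen_k = [], []
    for _ in range(n_rep):  # repetitions with new random splits
        outer = make_folds(list(range(len(X))), n_outer, rng)
        for f, test_idx in enumerate(outer):  # outer loop: evaluation
            calib_idx = [i for g, fold in enumerate(outer) if g != f for i in fold]
            inner = make_folds(calib_idx, n_inner, rng)

            def inner_score(k):  # inner loop: optimization of k
                scores = []
                for h, val_idx in enumerate(inner):
                    tr = [i for g, fold in enumerate(inner) if g != h for i in fold]
                    scores.append(accuracy(tr, val_idx, X, y, k))
                return sum(scores) / len(scores)

            best_k = max(k_values, key=inner_score)  # ties go to the first candidate
            chosen_k.append(best_k)
            # Test objects were never used when choosing best_k.
            outer_scores.append(accuracy(calib_idx, test_idx, X, y, best_k))
    return outer_scores, chosen_k


# Synthetic, well-separated two-class data (illustration only, not the olive oil or mineral data).
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(30)]
X += [[data_rng.gauss(10, 1), data_rng.gauss(10, 1)] for _ in range(30)]
y = [0] * 30 + [1] * 30

scores, ks = rdcv_knn(X, y)
print("mean accuracy:", sum(scores) / len(scores))
print("frequency of chosen k:", Counter(ks))
```

With 5 repetitions and 3 outer folds, the sketch yields 15 accuracy values and 15 chosen values of k; the spread of both is what makes the estimate cautious, since it exposes how strongly the result depends on the random split.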