Article ID: 1180734
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2014
Pages: 8
File Type: PDF
Abstract

• We combine “repeated double cross validation” (rdCV) with KNN classification.
• rdCV estimates the optimum number of neighbors in KNN independently of the evaluation.
• rdCV gives cautious estimates of predictive abilities (and their variabilities).
• We apply KNN-rdCV to classify the origin of Italian olive oils.
• We apply KNN-rdCV to classify minerals relevant to comet dust particles (ROSETTA).

Repeated double cross validation (rdCV) has recently been suggested as a careful and conservative strategy for optimizing and evaluating empirical multivariate calibration models. In this work, the strategy is adapted to k-nearest neighbor (KNN) classification. The basics of rdCV are described, including the search for an optimum k, and the method is tested with the Italian olive oil data. KNN-rdCV is then applied to classify 17 mineral groups relevant to the composition of comet dust particles; each sample is characterized by the peak heights at 20 selected masses in time-of-flight secondary ion mass spectra (TOF-SIMS). Predictive abilities exceed 95% for 15 of the mineral classes and are 75% and 85% for the remaining two.
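The general idea of rdCV with KNN can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`knn_predict`, `select_k`, `rdcv_knn`), the toy two-class data, the fold counts, and the rule for choosing k (most inner-loop hits, with ties going to the smaller k) are all assumptions made for illustration. An inner cross validation selects k from the calibration objects only, an outer cross validation evaluates that k on objects not used for the selection, and the whole procedure is repeated with different random splits.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def n_correct(train_idx, test_idx, X, y, k):
    """Number of test objects classified correctly by KNN built from train_idx."""
    tX = [X[i] for i in train_idx]
    ty = [y[i] for i in train_idx]
    return sum(knn_predict(tX, ty, X[i], k) == y[i] for i in test_idx)

def split_folds(indices, n_folds, rng):
    """Randomly split indices into n_folds disjoint segments."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]

def select_k(calib_idx, X, y, k_values, n_inner, rng):
    """Inner CV loop: choose k using the calibration objects only."""
    inner = split_folds(calib_idx, n_inner, rng)

    def hits(k):
        total = 0
        for val_idx in inner:
            held_out = set(val_idx)
            train_idx = [i for i in calib_idx if i not in held_out]
            total += n_correct(train_idx, val_idx, X, y, k)
        return total

    # Assumed rule: most inner-loop hits; max() keeps the first (smallest) k on ties.
    return max(k_values, key=hits)

def rdcv_knn(X, y, k_values=(1, 3, 5), n_rep=5, n_outer=3, n_inner=4, seed=0):
    """Repeated double CV: returns one accuracy per repetition and the chosen k values."""
    rng = random.Random(seed)
    all_idx = list(range(len(X)))
    accuracies, k_counts = [], Counter()
    for _ in range(n_rep):
        correct = 0
        for test_idx in split_folds(all_idx, n_outer, rng):
            held_out = set(test_idx)
            calib_idx = [i for i in all_idx if i not in held_out]
            best_k = select_k(calib_idx, X, y, k_values, n_inner, rng)
            k_counts[best_k] += 1
            # Outer loop: test objects played no part in selecting best_k.
            correct += n_correct(calib_idx, test_idx, X, y, best_k)
        accuracies.append(correct / len(X))
    return accuracies, k_counts

# Toy data (hypothetical): two well-separated classes along the first variable.
X = [(0.1 * i, 0.0) for i in range(20)] + [(10.0 + 0.1 * i, 0.0) for i in range(20)]
y = ["A"] * 20 + ["B"] * 20
accuracies, k_counts = rdcv_knn(X, y)
print(accuracies, dict(k_counts))
```

The spread of `accuracies` across repetitions gives a rough picture of the variability of the performance estimate, and `k_counts` shows how often each candidate k was selected in the inner loops.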
