Article code: 1168355
Journal code: 1491157
Publication year: 2009
English article: 7-page PDF
Full-text version: free download
English title of the ISI article
Bagged k-nearest neighbours classification with uncertainty in the variables
Related topics
Engineering and Basic Sciences, Chemistry, Analytical Chemistry
Abstract

An analytical result should be expressed as x ± U, where x is the experimental result obtained for a given variable and U is its uncertainty. This uncertainty is rarely taken into account in supervised classification. In this paper, we propose to use the uncertainty of the experimental results when computing the reliability of classification. The method combines k-nearest neighbours (kNN) with a nested bootstrap scheme: the first level generates new training sets with the classical bootstrap (B times), and the second level applies a new bootstrap method, called U-bootstrap, to each of them (D times). The first level reduces the effect of sampling and the second reduces the effect of the uncertainty. The resulting B × D bootstrap training sets are used to compute the reliability of classification for an unknown object using kNN, and the object is assigned to the class with the highest reliability. In this method, unlike classical kNN and Probabilistic Bagged k-nearest neighbours (PBkNN), the reliability of classification changes (increases or decreases) when the uncertainty is increased. These changes depend on the position of the unknown object with respect to the training objects. For the benchmark Wine dataset, we found a classification error rate (CER) similar to that of kNN (5.57%) and lower than that of PBkNN using Hamamoto's bootstrap (7.96%) or Efron's bootstrap (8.97%).
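
The nested scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how U-bootstrap draws its samples, so the sketch assumes each training value is perturbed with Gaussian noise of standard deviation U/2 (treating U as an expanded uncertainty with coverage factor 2), and it assumes the reliability of a class is the fraction of the B × D kNN runs that vote for it. The function names, the Euclidean distance and the toy data are all hypothetical.

```python
import numpy as np


def knn_predict(X_train, y_train, x, k):
    """Plain kNN: majority vote among the k nearest points (Euclidean)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists, kind="stable")[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]


def nested_bootstrap_reliability(X, U, y, x_new, k=3, B=20, D=20, seed=0):
    """Fraction of the B x D kNN runs that assign x_new to each class."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    counts = np.zeros(len(classes))
    n = len(X)
    for _ in range(B):
        # Level 1: classical bootstrap, resample objects with replacement.
        idx = rng.integers(0, n, size=n)
        Xb, Ub, yb = X[idx], U[idx], y[idx]
        for _ in range(D):
            # Level 2 (ASSUMED form of U-bootstrap): perturb each value
            # with Gaussian noise of standard deviation U / 2.
            Xd = Xb + rng.normal(0.0, Ub / 2.0)
            pred = knn_predict(Xd, yb, x_new, k)
            counts[np.searchsorted(classes, pred)] += 1
    return dict(zip(classes.tolist(), (counts / (B * D)).tolist()))


# Toy data (hypothetical): two well-separated classes of 20 objects each.
offsets = np.array([[i / 4.0, j / 3.0] for i in range(5) for j in range(4)])
X = np.vstack([offsets, offsets + 10.0])
y = np.array([0] * 20 + [1] * 20)
U = np.full_like(X, 0.1)  # the same uncertainty for every value

reliability = nested_bootstrap_reliability(X, U, y, np.array([0.5, 0.5]))
best_class = max(reliability, key=reliability.get)
```

Here `reliability` maps each class label to its share of the votes and `best_class` is the class with the highest share. Under these assumptions, increasing the entries of `U` only affects the result for objects close to a class boundary, which matches the abstract's remark that the effect depends on the position of the unknown object.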

Publisher
Database: Elsevier - ScienceDirect
Journal: Analytica Chimica Acta - Volume 646, Issues 1–2, 30 July 2009, Pages 62–68
Authors
, , ,