Article code: 1180800
Journal code: 1491570
Publication year: 2006
English article: 6-page PDF
Full text: free download
English title of the ISI article
Rejecting unclassifiable samples with decision forests
Related topics
Engineering and Basic Sciences, Chemistry, Analytical Chemistry
English abstract

Validation of empirical models is designed to produce statistics related to the average error rate of the model. These statistics can be used to minimize errors arising from extrapolation in the Y-values, but pay no attention to the X-block of predicted samples and cannot provide sample specific prediction confidences. In this manuscript, a novel method for identifying potentially poorly classified samples is described that is universal to any Decision Forest method. The samples identified as unclassifiable are assigned a “no-class” assignment and it is shown that these samples have a much higher error rate than samples assigned to a class. These samples are identified by creating a proximity matrix that calculates the similarity of each test sample to each training sample. This similarity is defined in terms of the path samples took through the tree and can be used as a transformed descriptor set for a k-nearest neighbor classifier. The Decision Forest prediction and the k-nearest neighbor prediction can then be combined to assign the sample prediction in such a way that the expected error of the prediction is more accurate. The method is purely automatic and does not require any parameters beyond the determination of k.
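
The rejection idea in the abstract can be illustrated with a small, dependency-free Python sketch. It rests on several assumptions that the abstract does not confirm: proximity is taken to be the fraction of trees in which two samples reach the same terminal leaf (a common forest proximity, used here as a stand-in for the paper's path-based similarity), the leaf indices are hypothetical toy values rather than output from a trained forest, and the combination rule (reject when the forest and the k-nearest neighbor vote disagree) is only one plausible reading of "combined". All function and variable names are illustrative.

```python
from collections import Counter

def proximity(leaves_a, leaves_b):
    """Fraction of trees in which two samples end in the same leaf (assumed similarity)."""
    same = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return same / len(leaves_a)

def knn_predict(test_leaves, train_leaves, train_labels, k):
    """Majority vote among the k training samples with the highest proximity."""
    ranked = sorted(
        zip(train_leaves, train_labels),
        key=lambda pair: proximity(test_leaves, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def classify_or_reject(forest_pred, test_leaves, train_leaves, train_labels, k=3):
    """Assumed rule: keep the forest's class only if the proximity-based k-NN agrees."""
    knn_pred = knn_predict(test_leaves, train_leaves, train_labels, k)
    return forest_pred if knn_pred == forest_pred else "no-class"

# Hypothetical leaf indices: one row per training sample, one column per tree.
TRAIN_LEAVES = [
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [2, 3, 3, 2],
    [2, 3, 2, 2],
    [2, 2, 3, 2],
]
TRAIN_LABELS = ["A", "A", "A", "B", "B", "B"]

# The forest predicts "A" and the nearest training samples are also "A": accepted.
accepted = classify_or_reject("A", [0, 0, 1, 0], TRAIN_LEAVES, TRAIN_LABELS, k=3)

# The forest predicts "A" but the nearest training samples are "B": rejected.
rejected = classify_or_reject("A", [2, 3, 3, 2], TRAIN_LEAVES, TRAIN_LABELS, k=3)

print(accepted, rejected)
```

In this sketch k is the only tunable value, which matches the abstract's statement that no parameters are needed beyond the determination of k. With a real forest, the leaf indices would come from passing each sample down every tree.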

Publisher
Database: Elsevier - ScienceDirect
Journal: Chemometrics and Intelligent Laboratory Systems - Volume 84, Issues 1–2, 1 December 2006, Pages 40–45
Authors
, ,