Article code: 529892
Journal code: 869719
Publication year: 2015
English article: 17 pages, PDF
English title of the ISI article
Discrete optimal Bayesian classification with error-conditioned sequential sampling
Persian translation of the title
Optimal Bayesian classification using error-conditioned sequential sampling
Keywords
Optimal Bayesian classification, controlled sampling, prior knowledge
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
English abstract


• A sampling algorithm for training the optimal Bayesian classifier is introduced.
• The algorithm minimizes the expected error over the uncertainty class defined by the prior knowledge.
• Using a Zipf model, we show that our sampling algorithm yields a lower true error on average than random sampling.
• Our algorithm remains robust even when the prior knowledge drifts away from the true distributions.
• An example on data from the p53 network shows that our method also works well on real pathway data.

When in possession of prior knowledge concerning the feature-label distribution, in particular, when it is known that the feature-label distribution belongs to an uncertainty class of distributions governed by a prior distribution, this prior knowledge can be used in conjunction with the training data to construct the optimal Bayesian classifier (OBC), whose performance is, on average, optimal among all classifiers relative to the posterior distribution derived from the prior distribution and the data. Typically in classification theory it is assumed that sampling is performed randomly in accordance with the prior probabilities on the classes and this has heretofore been true in the case of OBC. In the present paper we propose to forego random sampling and utilize the prior knowledge and previously collected data to determine which class to sample from at each step of the sampling. Specifically, we choose to sample from the class that leads to the smallest expected classification error with the addition of the new sample point. We demonstrate the superiority of the resulting nonrandom sampling procedure to random sampling on both synthetic data and data generated from known biological pathways.
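The sampling rule described in the abstract (pick the class whose next sample is expected to give the smallest classification error) can be illustrated with a minimal sketch. The sketch rests on assumptions not stated in this listing: two classes, discrete features falling into a fixed number of bins, independent Dirichlet priors on each class's bin probabilities, and a known class-0 probability `c`. All function and parameter names are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def posterior_mean(alpha: Sequence[float], counts: Sequence[int]) -> List[float]:
    """Posterior mean of bin probabilities under a Dirichlet prior (assumed model)."""
    total = sum(alpha) + sum(counts)
    return [(a + u) / total for a, u in zip(alpha, counts)]


def expected_obc_error(c, alpha0, counts0, alpha1, counts1) -> float:
    """Expected error of the classifier that labels each bin by the larger
    posterior-weighted mass, with the class-0 probability c assumed known."""
    p = posterior_mean(alpha0, counts0)
    q = posterior_mean(alpha1, counts1)
    return sum(min(c * pi, (1 - c) * qi) for pi, qi in zip(p, q))


def expected_error_after_sampling(label, c, alpha0, counts0, alpha1, counts1) -> float:
    """One-step lookahead: average the expected error over the bin in which
    the next sample from class `label` might land, weighted by the
    posterior predictive probability of that bin."""
    counts = [list(counts0), list(counts1)]
    alphas = [alpha0, alpha1]
    predictive = posterior_mean(alphas[label], counts[label])
    total = 0.0
    for j, prob in enumerate(predictive):
        counts[label][j] += 1  # pretend the new point fell in bin j
        total += prob * expected_obc_error(c, alpha0, counts[0], alpha1, counts[1])
        counts[label][j] -= 1  # undo the hypothetical observation
    return total


def choose_class(c, alpha0, counts0, alpha1, counts1) -> int:
    """Return the class (0 or 1) whose next sample minimizes the lookahead error."""
    e0 = expected_error_after_sampling(0, c, alpha0, counts0, alpha1, counts1)
    e1 = expected_error_after_sampling(1, c, alpha0, counts0, alpha1, counts1)
    return 0 if e0 <= e1 else 1
```

For example, with two bins, uniform priors `[1, 1]`, `c = 0.5`, 100 points already drawn from class 0 (50 per bin) and none from class 1, `choose_class` returns 1: one more class-0 point barely changes the posterior, whereas the first class-1 point is expected to reduce the error considerably.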

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 48, Issue 11, November 2015, Pages 3766–3782