Article code: 402325
Journal code: 676906
Publication year: 2014
English article: 8 pages, PDF
Full-text version: Free download
English title of the ISI article
Detecting potential labeling errors for bioinformatics by multiple voting
Persian translation of the title
شناسایی خطاهای احتمالی برچسب‌گذاری برای بیوانفورماتیک با رأی‌گیری چندگانه
Keywords
Bioinformatics analysis, mislabeled data detection, single-voting, multiple-voting, classification
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Classification techniques are important in bioinformatics analysis because they separate bioinformatics data into distinct groups. Obtaining good classifiers requires accurately labeled training data. However, labels in practical bioinformatics applications may be erroneous for various reasons. To identify such mislabeled data, an ensemble-learning-based scheme, single-voting, has been widely used. It generates multiple classifiers and uses their votes to detect mislabeled data. The single-voting scheme consists mainly of two components: a data partitioning component that generates multiple classifiers, and a mislabeled-data detection component that identifies mislabeled data. Existing work in this field focuses mainly on the detection component and neglects data partitioning. However, our analysis shows that data partitioning plays an important role in the single-voting scheme. This analysis leads us to propose a novel multiple-voting scheme, which improves on traditional single-voting by reducing the unreliable influence of data partitioning. Empirical and theoretical evaluations on a set of bioinformatics datasets illustrate the utility of the proposed scheme.
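
The two components named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the nearest-centroid classifier, the label-stratified partitioning, the majority thresholds, and every function name below are assumptions chosen for illustration. Here "single-voting" is taken to mean one random partition whose per-part classifiers vote on held-out samples, and "multiple-voting" is taken to mean repeating that over several partitions and keeping the samples flagged in most rounds.

```python
import random
from collections import Counter


def fit_centroids(points, labels):
    """Stand-in classifier (assumption): one mean vector per given label."""
    groups = {}
    for p, y in zip(points, labels):
        groups.setdefault(y, []).append(p)
    return {y: [sum(col) / len(ps) for col in zip(*ps)] for y, ps in groups.items()}


def predict(model, p):
    """Return the label whose centroid is closest to p (squared Euclidean)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], p)))


def stratified_partition(labels, n_parts, rng):
    """Data partitioning component: random split, balanced per given label."""
    by_label = {}
    for i, y in enumerate(labels):
        by_label.setdefault(y, []).append(i)
    parts = [[] for _ in range(n_parts)]
    for idxs in by_label.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            parts[pos % n_parts].append(i)
    return parts


def single_voting(points, labels, n_parts=5, rng=None):
    """One partition; classifiers trained on the other parts vote on each sample."""
    if rng is None:
        rng = random.Random(0)
    parts = stratified_partition(labels, n_parts, rng)
    models = [
        fit_centroids([points[i] for i in part], [labels[i] for i in part])
        for part in parts
    ]
    flagged = set()
    for k, part in enumerate(parts):
        voters = [m for j, m in enumerate(models) if j != k]
        for i in part:
            # Detection component: count voters that disagree with the given label.
            disagree = sum(predict(m, points[i]) != labels[i] for m in voters)
            if disagree > len(voters) / 2:
                flagged.add(i)
    return flagged


def multiple_voting(points, labels, n_rounds=7, n_parts=5, seed=0):
    """Repeat single-voting over several partitions; keep majority-flagged samples."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_rounds):
        counts.update(single_voting(points, labels, n_parts, rng))
    return {i for i, c in counts.items() if c > n_rounds / 2}


# Toy data (made up): two well-separated 1-D clusters with two flipped labels.
points = [(0.05 * i,) for i in range(20)] + [(10 + 0.05 * i,) for i in range(20)]
labels = ["A"] * 20 + ["B"] * 20
labels[3], labels[27] = "B", "A"

print(sorted(multiple_voting(points, labels)))  # prints [3, 27]
```

On this deliberately easy toy data a single partition already recovers both flipped labels, so the two functions agree. The repetition in `multiple_voting` is meant to matter only when one partition happens to produce unreliable classifiers.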

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 66, August 2014, Pages 28–35