Article code: 4969100
Journal code: 1449894
Publication year: 2018
English article: 41-page PDF
Full-text version: free download
English title of the ISI article
Classification with class noises through probabilistic sampling
Persian translation of the title
طبقه بندی با صداهای کلاس از طریق نمونه برداری احتمالی
Keywords
Mislabeled training data, one-zero sampling, probabilistic sampling, multiple voting
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract
Accurate labeling of training data plays a critical role in many supervised learning tasks. Because labels in practical applications can be erroneous for various reasons, a wide range of algorithms have been developed to identify and remove mislabeled data. In essence, these algorithms adopt a one-zero sampling (OSAM) strategy, in which a sample is selected and retained only if it is recognized as clean. OSAM can make two types of errors: identifying a clean sample as mislabeled and discarding it, or identifying a mislabeled sample as clean and retaining it. Both errors can lead to poor classification performance. To improve classification accuracy, this paper proposes a novel probabilistic sampling (PSAM) scheme, in which a cleaner sample has a higher chance of being selected. The degree of cleanliness is measured by the confidence in the label. To estimate this confidence accurately, a probabilistic multiple voting idea is proposed that assigns a high confidence value to a clean sample and a low confidence value to a mislabeled sample. Finally, we demonstrate that PSAM can effectively improve classification accuracy over existing OSAM methods.
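The contrast between OSAM and PSAM described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact confidence estimator or sampling rule, so everything here is an assumption made for illustration: confidence is taken to be the fraction of classifier votes that agree with the given label, OSAM is modeled as a hard threshold on that confidence, and PSAM is modeled as keeping each sample with probability equal to its confidence. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import random


def vote_confidence(label, votes):
    """Assumed confidence measure: fraction of votes that agree with the label."""
    if not votes:
        raise ValueError("votes must be non-empty")
    return sum(1 for v in votes if v == label) / len(votes)


def one_zero_sample(confidences, threshold=0.5):
    """OSAM-style hard selection: keep index i only if its confidence
    reaches the threshold (the threshold value is an assumption)."""
    return [i for i, c in enumerate(confidences) if c >= threshold]


def probabilistic_sample(confidences, rng):
    """PSAM-style soft selection: keep index i with probability confidences[i]."""
    return [i for i, c in enumerate(confidences) if rng.random() < c]


# Hypothetical toy data: four samples, each with a given label and
# five predicted labels from an (assumed) ensemble of classifiers.
labels = [0, 1, 1, 0]
votes = [
    [0, 0, 0, 0, 0],  # all votes agree with the label
    [1, 1, 0, 1, 0],  # three of five agree
    [0, 0, 1, 0, 0],  # one of five agrees
    [1, 1, 1, 1, 1],  # no vote agrees
]

confidences = [vote_confidence(y, v) for y, v in zip(labels, votes)]
osam_kept = one_zero_sample(confidences)
psam_kept = probabilistic_sample(confidences, random.Random(0))
```

Under these assumptions, the hard threshold always discards the third sample, whereas probabilistic sampling still retains it in roughly one draw out of five, so a clean sample that happens to receive a low confidence is not lost for certain.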
Publisher
Database: Elsevier - ScienceDirect
Journal: Information Fusion - Volume 41, May 2018, Pages 57-67