Free article download: Classifying imbalanced data sets using similarity based hierarchical decomposition

Article code	Journal code	Publication year	English article	Full-text version
529923	869724	2015	20-page PDF	Free download

English title of the ISI article

Classifying imbalanced data sets using similarity based hierarchical decomposition



Keywords

Class imbalance problem, hierarchical decomposition, clustering, outlier detection, minority and majority classes

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition


Abstract

• A novel method for imbalanced dataset classification.
• A new hierarchical classifier which does not use a fixed feature/class hierarchy.
• Uses clustering and outlier detection to construct the hierarchy.
• Shows that different feature spaces can be used to build a hierarchy.
• Successful when the class imbalance ratio is low and when classes are highly overlapping.

Classifying data is difficult when the data is imbalanced and the classes overlap. In recent years, more research has focused on the classification of imbalanced data, since real-world data is often skewed. Traditional methods classify the class with the most samples (the majority class) more successfully than the other classes (the minority classes). Several methods exist for classifying imbalanced data sets, each with its own advantages and shortcomings. In this study, we propose a new hierarchical decomposition method for imbalanced data sets that differs from previously proposed solutions to the class imbalance problem. Unlike many other solutions, it does not require a data pre-processing step. The new method is based on clustering and outlier detection. The hierarchy is constructed using the similarity of labeled data subsets at each level, with different levels built from different data and feature subsets. Clustering is used to partition the data, while outlier detection is used to detect minority class samples. A comparison with state-of-the-art methods on 20 public imbalanced data sets and 181 synthetic data sets showed that the proposed method's classification performance is better than that of the state-of-the-art methods. It is especially successful when the minority class is sparser than the majority class. It remains accurate even when classes have sub-varieties and the minority and majority classes overlap. Moreover, it also performs well when the class imbalance ratio is low, i.e., when the classes are more imbalanced.
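The two building blocks named in the abstract, clustering to partition the data and outlier detection to find minority samples, can be illustrated with a small sketch. This is not the authors' algorithm: it shows a single level only, and the k-means routine, the distance-based outlier rule (mean plus 1.5 standard deviations), the function names, and the toy data are all assumptions made for illustration. The imbalance ratio is assumed here to be the minority count divided by the majority count, which matches the abstract's remark that a lower ratio means a more imbalanced data set.

```python
import math
import statistics
from collections import Counter


def imbalance_ratio(labels):
    """Minority count / majority count (assumed definition: lower = more skewed)."""
    counts = Counter(labels)
    return min(counts.values()) / max(counts.values())


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    assign = []
    for _ in range(iters):
        # Assign each point to its nearest centre, then recompute the centres.
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = centroid(members)
    return assign


def distance_outliers(points, factor=1.5):
    """Positions of points farther from the centroid than mean + factor * std."""
    center = centroid(points)
    dists = [math.dist(p, center) for p in points]
    cutoff = statistics.mean(dists) + factor * statistics.pstdev(dists)
    return [i for i, d in enumerate(dists) if d > cutoff]


def decompose_one_level(points, labels, k):
    """Cluster the data; for each cluster report purity and flagged outliers."""
    assign = kmeans(points, k)
    report = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        pure = len({labels[i] for i in idx}) == 1
        # Only mixed clusters are searched for outliers (an assumption of this sketch).
        flagged = [] if pure else [idx[j] for j in distance_outliers([points[i] for i in idx])]
        report.append({"members": idx, "pure": pure, "outliers": flagged})
    return report


# Toy data: two majority blobs, with one minority sample on the edge of the first.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (3.0, 3.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
labels = ["maj"] * 5 + ["min"] + ["maj"] * 4

report = decompose_one_level(points, labels, k=2)
print(imbalance_ratio(labels))
for cluster in report:
    print(cluster)
```

In a full hierarchy, mixed clusters would be passed to a further level, possibly with a different feature subset, as the abstract describes; that recursion is left out of the sketch.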

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 48, Issue 5, May 2015, Pages 1653–1672

Authors

Cigdem Beyan, Robert Fisher
