Article code: 393597
Journal code: 665658
Publication year: 2014
English article: 23-page PDF
Full-text version: Free download
English title of the ISI article
Incremental learning based multiobjective fuzzy clustering for categorical data
Persian translation of the title
خوشه بندی فازی چند منظوره مبتنی بر یادگیری اضافه شده برای داده های طبقه بندی شده
Keywords
Categorical data; Fuzzy clustering; Multiobjective differential evolution; Random forest; Statistical test
Related topics
Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence
English abstract


• Categorical data clustering.
• Development of a new multiobjective algorithm for clustering categorical data.
• Use of random forest in incremental learning based multiobjective fuzzy clustering.
• New clustering technique that evolves the best solution from the set of Pareto-optimal solutions.

The problem of clustering categorical data, where attribute values cannot be naturally ordered as numerical values, has gained importance in recent times. Due to the special properties of categorical attributes, the clustering of categorical data seems to be more complicated than that of numerical data. Although a few clustering algorithms that optimize a single clustering objective have been proposed, it has been found that such a single measure may not be appropriate for all kinds of datasets. Hence, in this article, an Incremental Learning based Multiobjective Fuzzy Clustering for Categorical Data is proposed. For this purpose, a multiobjective modified differential evolution based fuzzy clustering algorithm is developed. Thereafter, it is integrated with the well-known supervised classifier, random forest, using incremental learning to obtain the aforementioned technique. Here, the multiobjective algorithm produces a set of optimal clustering solutions, known as Pareto-optimal solutions, by optimizing two conflicting objectives simultaneously. Subsequently, the final solution is evolved from the ensemble of Pareto-optimal solutions through incremental learning using the random forest classifier. The results of the proposed method are demonstrated quantitatively and visually in comparison with widely used state-of-the-art methods on six synthetic and four real-life datasets. Finally, a statistical test is conducted to show the superiority of the results produced by the proposed method.
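
The notion of a Pareto-optimal set mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not name the two objectives, so the sketch assumes both are minimized, and the names `dominates`, `pareto_front`, and the scores in `candidate_scores` are hypothetical values chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# One candidate clustering is summarized by two objective values.
# Assumption: both objectives are to be minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: Sequence[Objectives]) -> List[int]:
    """Return the indices of the solutions that no other solution dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (objective_1, objective_2) scores for five candidate clusterings.
candidate_scores = [
    (0.30, 0.90),
    (0.40, 0.60),
    (0.50, 0.70),  # dominated by (0.40, 0.60)
    (0.70, 0.20),
    (0.80, 0.25),  # dominated by (0.70, 0.20)
]

print(pareto_front(candidate_scores))  # -> [0, 1, 3]
```

Because the objectives conflict, several candidates survive the filter, and none of them is better than the others on both measures. This is why a separate step is needed to derive a single final clustering from the set.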

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 267, 20 May 2014, Pages 35–57
Authors
, ,