کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
385695 660869 2011 6 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
A fuzzy c-means-type algorithm for clustering of data with mixed numeric and categorical attributes employing a probabilistic dissimilarity functional
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
A fuzzy c-means-type algorithm for clustering of data with mixed numeric and categorical attributes employing a probabilistic dissimilarity functional
چکیده انگلیسی

Gath–Geva (GG) algorithm is one of the most popular methodologies for fuzzy c-means (FCM)-type clustering of data comprising numeric attributes; it is based on the assumption of data deriving from clusters of Gaussian form, a much more flexible construction compared to the spherical clusters assumption of the original FCM. In this paper, we introduce an extension of the GG algorithm to allow for the effective handling of data with mixed numeric and categorical attributes. Traditionally, fuzzy clustering of such data is conducted by means of the fuzzy k-prototypes algorithm, which merely consists in the execution of the original FCM algorithm using a different dissimilarity functional, suitable for attributes with mixed numeric and categorical attributes. On the contrary, in this work we provide a novel FCM-type algorithm employing a fully probabilistic dissimilarity functional for handling data with mixed-type attributes. Our approach utilizes a fuzzy objective function regularized by Kullback–Leibler (KL) divergence information, and is formulated on the basis of a set of probabilistic assumptions regarding the form of the derived clusters. We evaluate the efficacy of the proposed approach using benchmark data, and we compare it with competing fuzzy and non-fuzzy clustering algorithms.

Research highlights
► A fuzzy clustering algorithm for mixed numeric and categorical attributes is proposed.
► The method employs a novel probabilistic dissimilarity functional.
► Applications include analysis of financial and medical data.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Expert Systems with Applications - Volume 38, Issue 7, July 2011, Pages 8684–8689
نویسندگان
,