Article ID: 4960533
Journal: Procedia Computer Science
Published Year: 2017
Pages: 10 Pages
File Type: PDF
Abstract

The proposed modification of the conventional fuzzy C-means (FCM) clustering algorithm aims to correct some of its shortcomings. We focus on the following: lack of flexibility in adapting the number of clusters; a limited range of cluster types that can be grouped; a less than optimal objective function for clusters of unequal size lying very close to each other; and considerable computational time, particularly for high-dimensional data. With M&MFCM we propose replacing the usual Euclidean distance with the Mahalanobis and Minkowski metrics in order to enhance the cluster detection capacity of FCM, allowing more accurate detection of arbitrarily shaped clusters in high-dimensional datasets. Directly replacing the Euclidean distance in the FCM objective function with the Mahalanobis distance might cause numerical problems, as the largest eigenvalues of the fuzzy covariance matrix could produce extremely long clusters, thus contradicting the real data distribution. The improvement is achieved by fixing the ratio between the maximal and minimal eigenvalues of the covariance matrix. The parameterized Minkowski distance metric is adapted for use with FCM under various settings. We also propose an approach for improving the initial choice of the number of clusters and for visualizing and analyzing clustering results for labeled and unlabeled datasets. Experimental results demonstrate that the proposed M&MFCM and test methodology significantly improve FCM clustering results.
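
The two distance substitutions described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_ratio` threshold, and the choice to raise small eigenvalues (rather than lower large ones) are assumptions made for illustration, and the sketch presumes a symmetric covariance matrix with a positive largest eigenvalue.

```python
import numpy as np

def clamp_eigen_ratio(cov, max_ratio=1e3):
    """Bound the largest/smallest eigenvalue ratio of a symmetric matrix.

    Assumption: small eigenvalues are raised to lambda_max / max_ratio;
    the abstract only says the ratio is fixed, not how.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    floor = eigvals.max() / max_ratio           # smallest eigenvalue allowed
    eigvals = np.maximum(eigvals, floor)
    return (eigvecs * eigvals) @ eigvecs.T      # V diag(lambda) V^T

def mahalanobis_sq(x, center, cov):
    """Squared Mahalanobis distance from x to a cluster center."""
    diff = x - center
    return float(diff @ np.linalg.solve(cov, diff))

def minkowski(x, center, p=2.0):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return float(np.sum(np.abs(x - center) ** p) ** (1.0 / p))

# Hypothetical elongated covariance: eigenvalue ratio 1e6 before clamping.
cov = np.diag([1e6, 1.0])
bounded = clamp_eigen_ratio(cov, max_ratio=100.0)
x, center = np.array([3.0, 4.0]), np.zeros(2)
d_mahal = mahalanobis_sq(x, center, bounded)
d_mink = minkowski(x, center, p=1.0)
```

In an FCM iteration, either distance would take the place of the Euclidean term when memberships are updated; clamping the fuzzy covariance matrix first keeps the Mahalanobis form from stretching a cluster far beyond what the data support.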

Related Topics
Physical Sciences and Engineering Computer Science Computer Science (General)