کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
474147 698846 2008 10 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Categorical data fuzzy clustering: An analysis of local search heuristics
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر علوم کامپیوتر (عمومی)
پیش نمایش صفحه اول مقاله
Categorical data fuzzy clustering: An analysis of local search heuristics
چکیده انگلیسی

The fuzzy c partition of a set of qualitative data is the problem of selecting the optimal c   centroids that are the most representative of the whole population. Moreover, a set of weights wijwij must be determined, describing the fuzzy membership function of pattern i to the cluster represented by centroid j. Both problems are formulated by a single mathematical programming problem, that is an extension of the classic p-median models often used for clustering. The new objective function is neither concave nor convex and the application requires the clustering of many thousands of data, therefore heuristic methods are to be developed to find the best fuzzy partition.In this contribution, four methods are selected, that are implementations of meta-heuristics tested to solve p-median problems. Here, they are implemented and tested to find the optimal fuzzy c-partition. All heuristics implement neighborhood search with different strategies of visiting neighboring solutions: they are random restart method (RR), that is used in many commercial softwares and suggested in textbooks, tabu search (TS) that tries to find the best move to escape from a local optimum, variable neighborhood search (VNS), that explores quickly the solution space, candidate list search (CLS), that explores only interesting starting solutions.It is found that there is not a clear best method, but their performance depends on some parameter. TS is usually accurate, but time consuming. When c is small, VNS can be a reliable alternative, while, when c is large and there are many data to cluster, CLS provides good results. We point out that the simple RR method, that is sometimes used in commercial codes is of very poor quality: the implementation of one of the neighbor search algorithms leads to substantial improvements.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computers & Operations Research - Volume 35, Issue 3, March 2008, Pages 766–775
نویسندگان
,