کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
533709 870157 2008 11 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
On clustering tree structured data with categorical nature
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر چشم انداز کامپیوتر و تشخیص الگو
پیش نمایش صفحه اول مقاله
On clustering tree structured data with categorical nature
چکیده انگلیسی

Clustering consists in partitioning a set of objects into disjoint and homogeneous clusters. For many years, clustering methods have been applied in a wide variety of disciplines and they also have been utilized in many scientific areas. Traditionally, clustering methods deal with numerical data, i.e. objects represented by a conjunction of numerical attribute values. However, nowadays commercial or scientific databases usually contain categorical data, i.e. objects represented by categorical attributes. In this paper we present a dissimilarity measure which is capable to deal with tree structured categorical data. Thus, it can be used for extending the various versions of the very popular k-means clustering algorithm to deal with such data. We discuss how such an extension can be achieved. Moreover, we empirically prove that the proposed dissimilarity measure is accurate, compared to other well-known (dis)similarity measures for categorical data.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Pattern Recognition - Volume 41, Issue 12, December 2008, Pages 3613–3623
نویسندگان
, ,