Article code | Journal code | Publication year | English article | Full text
4970036 | 1450022 | 2017 | 8-page PDF | Free download
English title of the ISI article
A novel density peaks clustering algorithm for mixed data
Persian translation of the title
A novel density peaks clustering algorithm for mixed data
Keywords
Data clustering, density peaks, entropy, mixed data
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract
The density peaks clustering (DPC) algorithm is well known for its effectiveness on data sets with non-spherical distributions. However, it works only on numerical values, which prevents it from being used to cluster real-world data containing both categorical and numerical values. Traditional clustering algorithms for mixed data rely on pre-processing based on binary encoding, but such methods destroy the original structure of categorical attributes. Other solutions based on simple matching, such as K-Prototypes, need a user-defined parameter to avoid favoring either type of attribute. To overcome these problems, we present a novel clustering algorithm for mixed data, called DPC-MD. We improve DPC by using a new similarity criterion that handles three types of data: numerical, categorical, and mixed. Compared to other methods for mixed data, DPC is better suited to data with non-spherical distributions. The core of the proposed method is a new similarity measure for mixed data, designed to avoid feature transformation and parameter adjustment. The performance of our method is demonstrated by experiments on several real-world data sets, in comparison with traditional clustering algorithms such as K-Modes, K-Prototypes, EKP and SBAC.
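As a rough illustration of the background the abstract refers to, the sketch below computes the two quantities of the original DPC algorithm (local density rho and distance delta to the nearest denser point) from a distance matrix, using a K-Prototypes-style simple-matching dissimilarity. This is not the DPC-MD similarity measure, which the abstract does not specify. The function names, the cutoff d_c, the weight gamma and the toy data are all assumptions made for this example. The gamma weight is the kind of user-defined parameter the abstract says DPC-MD aims to avoid.

```python
import numpy as np

def mixed_distance(num, cat, gamma=1.0):
    """K-Prototypes-style dissimilarity (an illustrative stand-in, not DPC-MD):
    squared Euclidean distance on the numerical columns plus gamma times
    the number of mismatching categorical columns."""
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)  # categorical columns, here given as integer codes
    d_num = ((num[:, None, :] - num[None, :, :]) ** 2).sum(axis=2)
    d_cat = (cat[:, None, :] != cat[None, :, :]).sum(axis=2)
    return d_num + gamma * d_cat

def density_peaks(dist, d_c):
    """Return (rho, delta) as used in the DPC decision graph.
    rho[i]   : number of other points closer to i than the cutoff d_c.
    delta[i] : distance to the nearest point with strictly higher rho;
               points with no denser neighbour get the largest distance
               in their row (a simplification when densities tie)."""
    n = dist.shape[0]
    rho = (dist < d_c).sum(axis=1) - 1  # subtract 1 to exclude the point itself
    delta = np.empty(n)
    for i in range(n):
        higher = np.where(rho > rho[i])[0]
        delta[i] = dist[i, higher].min() if higher.size else dist[i].max()
    return rho, delta

# Toy data (hypothetical): two well-separated groups of three points each.
num = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
cat = [[0], [0], [0], [1], [1], [1]]

dist = mixed_distance(num, cat, gamma=1.0)
rho, delta = density_peaks(dist, d_c=0.02)
```

Points that combine a high rho with a high delta are candidate cluster centers. In this toy data those are the middle point of each group.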
Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 97, 1 October 2017, Pages 46-53