Article code | Journal code | Publication year | English article | Full-text version
6863776 | 1439521 | 2018 | 11 pages, PDF | Free download
English title of the ISI article
Learning category distance metric for data clustering
Persian translation of the title
کلاس یادگیری فاصله متریک برای خوشه بندی داده ها
Keywords
Data clustering; categorical attribute; distance metric; distance learning; category weight
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
Unsupervised learning of adaptive distance metrics for categorical data remains a challenge because it is difficult to define an inherently meaningful measure that parameterizes the heterogeneity within matched or mismatched categorical symbols. In this paper, a new distance metric called category distance and a non-center-based algorithm are proposed for categorical data clustering. The new metric is formulated based on the category weights for each categorical attribute and no longer depends on the common assumption that all categories on the same attribute are independent of each other. The problem of learning the category distance is therefore transformed into the problem of learning a set of category weights, which can be optimized jointly with the clusters. A case study on DNA sequences and experimental results on ten real-world data sets from different domains demonstrate the performance of the proposed methods in comparison with existing distance measures for categorical data.
Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 306, 6 September 2018, Pages 160-170