Article code: 381711
Journal code: 1437513
Year of publication: 2006
English article: 7 pages, PDF
Full-text version: free download
English title of the ISI article
Using clustering to learn distance functions for supervised similarity assessment
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Assessing the similarity between objects is a prerequisite for many data mining techniques. This paper introduces a novel approach to learn distance functions that maximizes the clustering of objects belonging to the same class. Objects belonging to a data set are clustered with respect to a given distance function and the local class density information of each cluster is then used by a weight adjustment heuristic to modify the distance function so that the class density is increased in the attribute space. This process of interleaving clustering with distance function modification is repeated until a “good” distance function has been found. We implemented our approach using the k-means clustering algorithm. We evaluated our approach using seven UCI data sets for a traditional 1-nearest-neighbor (1-NN) classifier and a compressed 1-NN classifier, called NCC, that uses the learnt distance function and cluster centroids instead of all the points of a training set. The experimental results show that attribute weighting leads to statistically significant improvements in prediction accuracy over a traditional 1-NN classifier for two of the seven data sets tested, whereas using NCC significantly improves the accuracy of the 1-NN classifier for four of the seven data sets.
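
The loop described in the abstract (cluster, inspect the class make-up of each cluster, adjust attribute weights, repeat) can be illustrated with a small sketch. The abstract does not spell out the weight adjustment heuristic, so the rule used here is an assumption made only for illustration: an attribute's weight grows when the majority class of a cluster is more tightly packed along that attribute than the cluster as a whole. The weighted Manhattan distance, the purity score, and all function names and parameters (`learn_weights`, `rate`, `rounds`, and so on) are likewise hypothetical and not taken from the paper.

```python
import random

def weighted_dist(a, b, w):
    # Weighted Manhattan distance; the weight vector w is what gets learned.
    return sum(wi * abs(x - y) for wi, x, y in zip(w, a, b))

def nearest(p, centroids, w):
    # Index of the centroid closest to p under the current weights.
    return min(range(len(centroids)), key=lambda c: weighted_dist(p, centroids[c], w))

def kmeans(points, k, w, iters=20, seed=0):
    # Plain k-means, except that assignment uses the weighted distance.
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        assign = [nearest(p, centroids, w) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p, centroids, w) for p in points]

def majority(labels):
    # Most frequent label; sorted() makes tie-breaking deterministic.
    return max(sorted(set(labels)), key=labels.count)

def purity(assign, labels, k):
    # Fraction of points that carry the majority label of their cluster.
    hits = 0
    for c in range(k):
        cl = [l for l, a in zip(labels, assign) if a == c]
        if cl:
            hits += cl.count(majority(cl))
    return hits / len(labels)

def spread(values):
    # Mean absolute deviation from the mean.
    m = sum(values) / len(values)
    return sum(abs(v - m) for v in values) / len(values)

def adjust_weights(points, labels, assign, k, w, rate=0.2):
    # ASSUMED heuristic (not the paper's): reward attributes along which the
    # majority class of a cluster is tighter than the cluster as a whole.
    new_w = list(w)
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if len(idx) < 2:
            continue
        maj_label = majority([labels[i] for i in idx])
        maj = [i for i in idx if labels[i] == maj_label]
        for j in range(len(w)):
            all_s = spread([points[i][j] for i in idx])
            maj_s = spread([points[i][j] for i in maj])
            if all_s > 0:
                new_w[j] += rate * w[j] * (all_s - maj_s) / all_s
    new_w = [max(x, 1e-9) for x in new_w]  # keep every weight positive
    total = sum(new_w)
    return [x / total for x in new_w]      # renormalize to sum to 1

def learn_weights(points, labels, k, rounds=10):
    # Interleave clustering and weight adjustment; keep the best weights seen.
    d = len(points[0])
    w = [1.0 / d] * d
    best_w, best_q = w, -1.0
    for _ in range(rounds):
        _, assign = kmeans(points, k, w)
        q = purity(assign, labels, k)
        if q > best_q:
            best_w, best_q = w, q
        w = adjust_weights(points, labels, assign, k, w)
    return best_w, best_q
```

Calling `learn_weights(points, labels, k)` on a list of numeric tuples and a parallel list of class labels returns the weight vector that produced the purest clustering, together with that purity. A fixed number of rounds stands in for the abstract's stopping criterion of a “good” distance function, which is not defined there.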

Publisher
Database: Elsevier - ScienceDirect
Journal: Engineering Applications of Artificial Intelligence - Volume 19, Issue 4, June 2006, Pages 395–401