Article code: 536426
Journal code: 870523
Publication year: 2013
English article: 6-page PDF
Full-text version: Free download
English title of the ISI article
treeKL: A distance between high dimension empirical distributions
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract

This paper offers a methodological contribution for computing the distance between two empirical distributions in a Euclidean space of very high dimension. We propose to use decision trees instead of relying on a standard quantization of the feature space. Our contribution is twofold: we first define a new distance between empirical distributions, based on the Kullback–Leibler (KL) divergence between the distributions over the leaves of decision trees built for the two empirical distributions. We then propose a new procedure to build these unsupervised trees efficiently. The performance of this new metric is illustrated on image clustering and neuron classification. Results show that the tree-based method outperforms standard bag-of-features procedures.
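
The abstract only outlines the approach, so the sketch below is a rough illustration rather than the paper's algorithm: it grows a kd-style unsupervised tree on each sample (splitting the highest-variance coordinate at its median, which is an assumption; the paper's own tree-building procedure is one of its contributions and is not described here), routes both samples to the leaves, and returns a symmetrised, smoothed KL divergence between the two leaf histograms. The function names, depth limit and smoothing constant are all illustrative choices.

# Illustrative sketch (not the paper's algorithm): compare two empirical
# distributions through the leaf histograms of unsupervised trees.
import numpy as np

def build_tree(points, depth=0, max_depth=6, min_leaf=10):
    """Grow a kd-style unsupervised tree by splitting the highest-variance
    coordinate at its median. Leaves are represented by None."""
    n, _ = points.shape
    if depth >= max_depth or n <= min_leaf:
        return None
    dim = int(np.argmax(points.var(axis=0)))
    thr = float(np.median(points[:, dim]))
    left, right = points[points[:, dim] <= thr], points[points[:, dim] > thr]
    if len(left) == 0 or len(right) == 0:
        return None
    return {"dim": dim, "thr": thr,
            "left": build_tree(left, depth + 1, max_depth, min_leaf),
            "right": build_tree(right, depth + 1, max_depth, min_leaf)}

def leaf_id(tree, x, path=""):
    """Route a point down the tree and return a string id for its leaf."""
    if tree is None:
        return path
    if x[tree["dim"]] <= tree["thr"]:
        return leaf_id(tree["left"], x, path + "L")
    return leaf_id(tree["right"], x, path + "R")

def leaf_histogram(tree, points, index, eps=1e-3):
    """Smoothed empirical distribution of `points` over the tree leaves."""
    counts = np.full(len(index), eps)
    for x in points:
        counts[index[leaf_id(tree, x)]] += 1.0
    return counts / counts.sum()

def tree_kl(sample_p, sample_q, max_depth=6):
    """Symmetrised KL divergence between two samples, measured over the
    leaves of the unsupervised trees grown on each of them."""
    total = 0.0
    for reference in (sample_p, sample_q):
        tree = build_tree(reference, max_depth=max_depth)
        index = {leaf: i for i, leaf in
                 enumerate(sorted({leaf_id(tree, x) for x in reference}))}
        p = leaf_histogram(tree, sample_p, index)
        q = leaf_histogram(tree, sample_q, index)
        total += float(np.sum(p * np.log(p / q)))
    return total / 2.0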


► We propose a distance between two sets of points in a high-dimensional space.
► The distributions are modeled with unsupervised trees.
► A distance is proposed between trees of different sizes.
► On normal distributions, the proposed distance gives results similar to the theoretical Kullback–Leibler expression (a rough numerical check in this spirit is sketched below).
► Applied to the classification of neurons, the distance outperforms the bag-of-features baseline.
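
As a hypothetical illustration of the Gaussian check mentioned in the highlights (not the paper's experiment), one can draw samples from two multivariate normals, evaluate the closed-form symmetrised KL divergence from their parameters, and compare it with the tree-based estimate from the sketch shown after the abstract. The means, covariances, sample sizes and seed below are arbitrary, and the two values are only expected to be of comparable magnitude, not identical.

# Hypothetical sanity check: tree-based distance vs. the closed-form
# symmetrised KL divergence between two Gaussians (uses tree_kl from the
# sketch after the abstract).
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) )."""
    d = len(mu0)
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

rng = np.random.default_rng(0)
d = 5
mu0, cov0 = np.zeros(d), np.eye(d)
mu1, cov1 = 0.5 * np.ones(d), 1.5 * np.eye(d)

sample_p = rng.multivariate_normal(mu0, cov0, size=2000)
sample_q = rng.multivariate_normal(mu1, cov1, size=2000)

closed_form = 0.5 * (gaussian_kl(mu0, cov0, mu1, cov1)
                     + gaussian_kl(mu1, cov1, mu0, cov0))
print("closed-form symmetrised KL:", closed_form)
print("tree-based estimate       :", tree_kl(sample_p, sample_q))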

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 34, Issue 2, 15 January 2013, Pages 140–145
Authors
, ,