Article code: 397272
Journal code: 671023
Publication year: 2011
English article: 17-page PDF
Full-text version: free download
English title of the ISI article
A bounded distance metric for comparing tree structure
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract

Comparing tree-structured data for structural similarity is a recurring theme and one on which much effort has been spent. Most approaches so far are grounded, implicitly or explicitly, in algorithmic information theory, being approximations to an information distance derived from Kolmogorov complexity. In this paper we propose a novel complexity metric, also grounded in information theory, but calculated via Shannon's entropy equations. This is used to formulate a directly and efficiently computable metric for the structural difference between unordered trees. The paper explains the derivation of the metric in terms of information theory, and proves the essential property that it is a distance metric. The property of boundedness means that the metric can be used in contexts such as clustering, where second-order comparisons are required. The distance metric property means that the metric can be used in the context of similarity search and metric spaces in general, allowing trees to be indexed and stored within this domain. We are not aware of any other tree similarity metric with these properties.

Research highlights
► The paper defines a tree similarity metric and its basis in information theory.
► The metric is naturally and intuitively bounded within the range [0,1].
► Distance depends only on the degree of common structure, and is magnitude-tolerant.
► The metric is computable and tractable.
► It is a true distance metric, with proof of the required properties given.
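
The abstract does not state the metric's formula, so the following is only an illustrative sketch of how an entropy-based, bounded distance over unordered trees can be computed; it is not the paper's metric. It assumes a hypothetical tree encoding as (label, children) tuples, turns each tree into a probability distribution over its root-to-node label paths, and uses the square root of the Jensen-Shannon divergence with base-2 logarithms, a swapped-in technique that is known to be a metric on probability distributions with values in [0, 1]. All function names are hypothetical.

```python
import math
from collections import Counter

# Assumed encoding: a tree is (label, [child, child, ...]).
# Child order is never inspected, so trees are treated as unordered.

def path_counts(tree, prefix=()):
    """Count every root-to-node label path in the tree."""
    label, children = tree
    path = prefix + (label,)
    counts = Counter({path: 1})
    for child in children:
        counts.update(path_counts(child, path))  # adds the child's counts
    return counts

def distribution(tree):
    """Normalise path counts into a probability distribution."""
    counts = path_counts(tree)
    total = sum(counts.values())
    return {path: n / total for path, n in counts.items()}

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def js_distance(t1, t2):
    """Square root of the Jensen-Shannon divergence between path distributions."""
    p, q = distribution(t1), distribution(t2)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in set(p) | set(q)}
    jsd = entropy(mix) - 0.5 * (entropy(p) + entropy(q))
    return math.sqrt(max(jsd, 0.0))  # clamp tiny negative rounding errors

# Example trees (hypothetical data).
t_a = ("r", [("a", [("x", [])]), ("b", [])])
t_b = ("r", [("b", []), ("a", [("x", [])])])  # same as t_a, siblings swapped
t_c = ("r", [("a", []), ("c", [])])
t_d = ("q", [("z", [])])                      # shares no path with t_a
```

Because child order is discarded when paths are counted, js_distance(t_a, t_b) is 0 up to rounding, while t_a and t_d, which share no label path, are at distance 1. On trees this sketch is strictly a pseudometric, since two different trees can produce the same path distribution.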

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Systems - Volume 36, Issue 4, June 2011, Pages 748–764