Article ID: 6857746
Journal: Information Sciences
Published Year: 2014
Pages: 13
File Type: PDF
Abstract
To improve join quality, our approach considers both XML structure and node label similarity by applying two tailored similarity measures. Min-hash, a probabilistic hashing technique, is employed to achieve scalability. Extensive experiments confirm that join quality is fundamentally improved when label similarity is considered, and that the efficiency of our join is even higher than that of some of the most efficient existing methods.
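As a rough illustration of why min-hash helps with scalability, the sketch below estimates the Jaccard similarity of two token sets from fixed-length signatures instead of comparing the sets directly. It is a generic min-hash example, not the method from the paper: the representation of an XML document as a set of label paths, the function names, and the choice of 128 hash functions are all assumptions made for illustration.

```python
import hashlib


def _hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token under a given seed (illustrative choice)."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes: int = 128):
    """Return one minimum hash value per seeded hash function."""
    token_set = set(tokens)
    if not token_set:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash(t, seed) for t in token_set) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b) -> float:
    """Fraction of signature positions that agree; approximates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b) -> float:
    """Exact Jaccard similarity, shown only for comparison with the estimate."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Hypothetical documents, each reduced to a set of label paths (an assumption).
doc_a = {"book/title", "book/author", "book/year"}
doc_b = {"book/title", "book/author", "book/price"}

sig_a = minhash_signature(doc_a)
sig_b = minhash_signature(doc_b)

print("exact:", exact_jaccard(doc_a, doc_b))
print("estimate:", estimate_jaccard(sig_a, sig_b))
```

Because each signature has a fixed length regardless of document size, comparing two documents costs the same however large they are, which is the property a large-scale join would rely on. The estimate gets closer to the exact value as the number of hash functions grows.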
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence