Article ID: 392363
Journal: Information Sciences
Published Year: 2014
Pages: 17
File Type: PDF
Abstract
In this paper, we propose a novel probabilistic correlation-based similarity measure. Rather than simply matching tokens between two records, our similarity evaluation enriches the information in each record by considering correlations between tokens. The probabilistic correlation between two tokens is defined as the probability that they appear together in the same record. We then compute token weights and discover correlations between records based on these probabilistic token correlations. Extensive experimental results demonstrate the effectiveness of the proposed approach.
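The abstract defines the probabilistic correlation between tokens only in words, so the following is a minimal sketch under stated assumptions rather than the authors' method: it assumes whitespace tokenization and estimates the co-occurrence probability of a token pair as the fraction of records that contain both tokens. The function name `cooccurrence_probabilities` and the sample records are hypothetical, and the token weighting and record-level correlation steps mentioned in the abstract are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_probabilities(records):
    """Estimate, for each token pair, the fraction of records containing both.

    Assumption: tokens are whitespace-separated and each token counts at most
    once per record. Pairs are keyed as (a, b) with a < b in sorted order.
    """
    n = len(records)
    pair_counts = Counter()
    for record in records:
        # Deduplicate and sort so each unordered pair has one canonical key.
        tokens = sorted(set(record.split()))
        for a, b in combinations(tokens, 2):
            pair_counts[(a, b)] += 1
    return {pair: count / n for pair, count in pair_counts.items()}


# Hypothetical sample records, for illustration only.
records = [
    "sigmod conference 2010",
    "sigmod conference 2011",
    "vldb conference 2010",
    "sigmod record",
]
probs = cooccurrence_probabilities(records)
print(probs[("conference", "sigmod")])  # 2 of 4 records contain both tokens
```

Pairs that never share a record are simply absent from the returned dictionary, which keeps the table sparse for large vocabularies.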
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence