Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
392363 | Information Sciences | 2014 | 17 Pages |
Abstract
In this paper, we propose a novel probabilistic correlation-based similarity measure. Rather than simply conducting the matching of tokens between two records, our similarity evaluation enriches the information of records by considering correlations of tokens. The probabilistic correlation between tokens is defined as the probability of them appearing together in the same records. Then we compute weights of tokens and discover correlations of records based on the probabilistic correlations of tokens. The extensive experimental results demonstrate the effectiveness of our proposed approach.
Keywords
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Shaoxu Song, Han Zhu, Lei Chen,