کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
396709 670557 2014 16 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Assessing single-pair similarity over graphs by aggregating first-meeting probabilities
ترجمه فارسی عنوان
ارزیابی شباهت یک جفت بیش از نمودار ها با جمع آوری احتمال احتمالی اولین جلسه
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
چکیده انگلیسی


• We propose an approach to compute similarity of a single node pair, avoiding computation of similarities of other node pairs.
• We prove that the method computes similarity without accuracy loss and its time cost is less than all-pair SimRank.
• We propose position probability and position matrix to facilitate the implementation of the proposed approach.
• Comprehensive experiments are conducted on both synthetic datasets and real datasets to demonstrate its accuracy and efficiency.

Link-based similarity plays an important role in measuring similarities between nodes in a graph. As a widely used link-based similarity, SimRank scores similarity between two nodes as the first-meeting probability of two random surfers. However, due to the large scale of graphs in real-world applications and dynamic change characteristic, it is not viable to frequently update the whole similarity matrix. Also, people often only concern about the similarities of a small subset of nodes in a graph. In such a case, the existing approaches need to compute the similarities of all node-pairs simultaneously, suffering from high computation cost.In this paper, we propose a new algorithm, Iterative Single-Pair SimRank (ISP), based on the random surfer-pair model to compute the SimRank similarity score for a single pair of nodes in a graph. To avoid computing similarities of all other nodes, we introduce a new data structure, position matrix, to facilitate computation of the first-meeting probabilities of two random surfers, and give two optimization techniques to further enhance their performance. In addition, we theoretically prove that the time cost of ISP is always less than the original algorithm SimRank. Comprehensive experiments conducted on both synthetic and real datasets demonstrate the effectiveness and efficiency of our approach.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Information Systems - Volume 42, June 2014, Pages 107–122
نویسندگان
, , , , , ,