کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
10321797 660751 2015 13 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Efficient link-based similarity search in web networks
ترجمه فارسی عنوان
جستجوی شبکهای مبتنی بر شباهت در شبکههای وب
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
چکیده انگلیسی
Similarity search in web networks, aiming to find entities similar to the given entity, is one of the core tasks in network analysis. With the proliferation of web applications, including web search and recommendation system, SimRank has been a well-known measure for evaluating entity similarity in a network. However, the existing work computes SimRank iteratively over a huge similarity matrix, which is expensive in terms of time and space cost and cannot efficiently support similarity search over large networks. In this paper, we propose a link-based similarity search method, WebSim, towards efficiently finding similar entities in web networks. WebSim defines the similarity between entities as the 2-hop similarity of SimRank. To reduce computation cost, we divide the similarity search process into two stages: off-line stage and on-line stage. In the off-line stage, the 1-hop similarities are computed, and an optimized algorithm is designed to reduce the unnecessary accumulation operations on zero similarities. In the on-line stage, the 2-hop similarities are computed, and a pruning algorithm is developed to support fast query processing through searching similar entries from a partial sums index derived from the 1-hop similarities. The index items that are lower than a given threshold are skipped to reduce the searching space. Compared to the iterative SimRank computation, the time and space cost of similarity computation is significantly reduced, since WebSim maintains only the similarity matrix of 1-hop that is much smaller than that of multi-hop. Experiments through comparison with SimRank and its optimized algorithms demonstrate that WebSim has on average a 99.83% reduction in the time cost and a 92.12% reduction in the space cost of similarity computation, and achieves on average 99.98% NDCG.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Expert Systems with Applications - Volume 42, Issue 22, 1 December 2015, Pages 8868-8880
نویسندگان
, , , , ,