Article code: 382707
Journal code: 660781
Publication year: 2015
English article: 14 pages, PDF
Full-text version: free download
English title of the ISI article
Top-k similarity search in heterogeneous information networks with x-star network schema
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• The efficiency of similarity computation is clearly improved.
• The results returned for similarity search are effective.
• A pruning algorithm is presented to support fast online query processing.
• The accuracy loss of the pruning algorithm can be controlled by setting thresholds.

An x-star network is an information network that consists of centers connected to one another and attributes of different types linked to these centers. As x-star networks become ubiquitous, extracting knowledge from them has become an important task. Similarity search in an x-star network aims to find the centers similar to a given query center, and has numerous applications, including collaborative filtering, community mining, and web search. Although existing methods such as SimRank and P-Rank yield promising results, they are not applicable to massive x-star networks. In this paper, we propose a structure-based similarity measure, NetSim, for efficiently computing the similarity between centers in an x-star network. The similarity between attributes is computed in the pre-processing stage as the expected meeting probability over the attribute network, which is extracted from the whole structure of the x-star network. The similarity between centers is computed online from the attribute similarities, based on the intuition that similar centers are linked to similar attributes. NetSim requires less time and space than existing methods because the attribute network is significantly smaller than the whole x-star network. To support fast online query processing, we develop a pruning algorithm that builds a pruning index and prunes unpromising candidate centers. Extensive experiments comparing NetSim with state-of-the-art measures demonstrate the effectiveness and efficiency of our method.
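
The two-stage idea in the abstract (attribute similarities computed offline, center similarities computed online from them) can be illustrated with a minimal sketch. This is not the paper's NetSim definition, which the abstract does not give. It assumes plain SimRank as the expected-meeting-probability measure on a toy attribute graph, the average pairwise attribute similarity as the center score, and a simple score threshold as a stand-in for the pruning index. All names (`simrank`, `center_similarity`, `top_k`) and the sample data are hypothetical.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Offline stage (assumed): plain SimRank on a small attribute graph.

    in_neighbors maps every node to the list of its in-neighbors.
    """
    nodes = list(in_neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, nodes):
            na, nb = in_neighbors[a], in_neighbors[b]
            if a == b:
                new[(a, b)] = 1.0
            elif not na or not nb:
                new[(a, b)] = 0.0
            else:
                total = sum(sim[(x, y)] for x in na for y in nb)
                new[(a, b)] = decay * total / (len(na) * len(nb))
        sim = new
    return sim


def center_similarity(attrs_a, attrs_b, attr_sim):
    """Online stage (assumed): average pairwise similarity of linked attributes."""
    if not attrs_a or not attrs_b:
        return 0.0
    total = sum(attr_sim[(x, y)] for x in attrs_a for y in attrs_b)
    return total / (len(attrs_a) * len(attrs_b))


def top_k(query, center_attrs, attr_sim, k, threshold=0.0):
    """Score every other center, drop scores below threshold, return the best k."""
    scored = []
    for center, attrs in center_attrs.items():
        if center == query:
            continue
        score = center_similarity(center_attrs[query], attrs, attr_sim)
        if score >= threshold:  # crude stand-in for a pruning index
            scored.append((center, score))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Toy attribute graph: "db" and "dm" share one in-neighbor, "ai" and "ml" another.
attribute_graph = {
    "topic_data": [],
    "topic_learning": [],
    "db": ["topic_data"],
    "dm": ["topic_data"],
    "ai": ["topic_learning"],
    "ml": ["topic_learning"],
}
# Toy centers (for example, papers) and the attributes linked to them.
centers = {"p1": ["db"], "p2": ["dm"], "p3": ["ai"], "p4": ["db", "ml"]}

attr_sim = simrank(attribute_graph)                      # pre-processing
print(top_k("p1", centers, attr_sim, k=2, threshold=0.1))
# [('p2', 0.8), ('p4', 0.5)]
```

In this sketch the expensive iteration runs only over the attribute graph, while each query touches just the precomputed attribute scores, which mirrors the cost argument made in the abstract. Raising `threshold` discards more candidates at the risk of dropping true matches.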

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 42, Issue 2, 1 February 2015, Pages 699–712