Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6856652 | Information Sciences | 2018 | 38 Pages |
Abstract
Strong simulation is a state-of-the-art approximate scheme in graph pattern matching. This scheme always finds high-quality results compared to other schemes. However, as the Web and social networks are increasingly used in human lives, the scale of the data grows extremely large. As a result, such big graphs are often stored in the distributed environment, in order to be managed efficiently. Unfortunately, current distributed algorithm for strong simulation is not efficient and cannot be applied to real applications. In this paper, we propose efficient parallel algorithms for strong simulation in the distributed setting. The contribution includes (1) We convert the calculation of strong simulation to calculating a relative small set of partial results for each partition of pattern suitable for distributed system. (2) We develop a method to reduce data shipment and time complexity of local computation in the distributed setting. (3) We split the distributed calculation of strong simulation into an offline redistribution algorithm and an online matching algorithm. The major data shipment is involved in the offline algorithm, while the online algorithm is highly parallel and much more efficient than current algorithms. (4) By experiments on both real and synthetic data, we verify the efficiency of our distributed algorithms and the effectiveness of our scheme without large intermediate results.
Keywords
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Hongzhi Wang, Ning Li, Jianzhong Li, Hong Gao,