Article ID Journal Published Year Pages File Type
6873305 Future Generation Computer Systems 2018 21 Pages PDF
Abstract
Using a single commodity computational node to partition big graph is very difficult. This work studies how to partition a big graph with respect to arbitrary proportions in a streaming manner. To meet diverse requirements of big graph partitioning scenarios, we first devise 3 measurement schemes for measuring the graph vertex count, graph workload, and graph processing time, respectively. These schemes are the bases and prerequisites for big graph partitioning. Due to the difficulty in acquiring full big graph information, we then design 8 streaming heuristics to partitioning a big graph during the process of loading its data from external disks into memory. Each of these heuristics decides where to assign every vertex in the stream based on the information calculated by one of the above 3 schemes. At last, we demonstrate the performance and flexibility of our heuristics in partitioning real and synthetic graph datasets on a medium-sized cluster. The characteristics of arbitrary proportions of our approach makes it have a wide range of applications.
Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, , , ,