کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
4944455 | 1437994 | 2017 | 24 صفحه PDF | دانلود رایگان |

We introduce H-group closeness centrality in this work. H-group closeness centrality of a group of nodes measures how close this node group is to other nodes in a graph, and can be used in numerous applications such as measuring the importance and influence of a group of users in a social network. When a large graph contains billions of edges that cannot reside entirely in a computer's main memory, computing and maximizing H-group closeness centrality both become challenging. In this work, we present a systematic solution for efficiently computing and maximizing H-group closeness centrality in large disk-resident graphs, only using a common PC. Our solution leverages a probabilistic counting method to efficiently estimate H-group closeness with high accuracy, rather than exhaustively computing it in an exact fashion. Furthermore, we design an I/O-efficient greedy algorithm to find a node group that approximately maximizes H-group closeness centrality in a graph. Our algorithm exploits several appealing properties of the defined H-group closeness measure to reduce the computational cost of handling a disk-resident graph. Extensive evaluation on large real-world and synthetic graphs demonstrates the efficiency of our proposed method. For example, our proposed I/O-efficient greedy algorithm is about 300 times faster than a simple multi-pass method on the Twitter graph with 1.4 billion edges. This reduces the running time of identifying one group member from nearly an hour to less than 20Â s on average.
Journal: Information Sciences - Volumes 400â401, August 2017, Pages 105-128