Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
5103188 | Physica A: Statistical Mechanics and its Applications | 2017 | 20 Pages |
Abstract
In the past few years, the storage and the analysis of large-scale and fast evolving networks presents a great challenge. Therefore, a number of different techniques have been proposed for sampling large networks. Studies on network sampling primarily analyze the changes of network properties under the sampling. In general, network exploration techniques approximate the original networks more accurate than random node and link selection. Yet, link selection with additional subgraph induction step outperforms most other techniques. In this paper, we apply subgraph induction also to random walk and forest-fire sampling and evaluate the effects of subgraph induction on the sampling accuracy. We analyze different real-world networks and the changes of their properties introduced by sampling. The results reveal that the techniques with subgraph induction improve the performance of techniques without induction and create denser sample networks with larger average degree. Furthermore, the accuracy of sampling decrease consistently across various sampling techniques, when the sampled networks are smaller. Based on the results of the comparison, we introduce the scheme for selecting the most appropriate technique for network sampling. Overall, the breadth-first exploration sampling proves as the best performing technique.
Keywords
Related Topics
Physical Sciences and Engineering
Mathematics
Mathematical Physics
Authors
Neli Blagus, Lovro Å ubelj, Marko Bajec,