کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
562234 | 1451943 | 2016 | 10 صفحه PDF | دانلود رایگان |
• The paper provides a framework for obtaining biased node samples by crawling the network under supervision.
• The paper introduces a new supervised sampling problem on networked data.
• The proposed method is to sample a sub network under a specific goal of, acquiring positive nodes.
Traditional graph sampling methods reduce the size of a large network via uniform sampling of nodes from the original network. The sampled network can be used to estimate the topological properties of the original network. However, in some application domains (e.g., disease surveillance), the goal of sampling is also to help identify a specified category of nodes (e.g., affected individuals) in a large network. This work therefore aims to, given a large information network, sample a subgraph under a specific goal of acquiring as many nodes with a particular category as possible. We refer to this problem as supervised sampling, where we sample a large network for a specific category of nodes. To this end, we model a network as a Markov chain and derive supervised random walks to learn stationary distributions of the sampled network. The learned stationary distribution can help identify the best node to be sampled in the next iteration. The iterative sampling process ensures that with new sampled nodes being acquired, supervised sampling can be strengthened in turn. Experiments on synthetic as well as real-world networks show that our supervised sampling algorithm outperforms existing methods in obtaining target nodes in the sampled networks.
Journal: Signal Processing - Volume 124, July 2016, Pages 93–102