کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
973755 | 1480127 | 2016 | 13 صفحه PDF | دانلود رایگان |

• We studied on the detection of overlapping communities, hubs and outliers.
• When the number of communities KK is known, we proposed a normalized symmetric NMF.
• But when KK is unknown, we proposed the Bayesian symmetric NMF.
• We used the Kullback–Leibler divergence for these algorithms.
For its crucial importance in the study of large-scale networks, many researchers devote to the detection of communities in various networks. It is now widely agreed that the communities usually overlap with each other. In some communities, there exist members that play a special role as hubs (also known as leaders), whose importance merits special attention. Moreover, it is also observed that some members of the network do not belong to any communities in a convincing way, and hence recognized as outliers. Failure to detect and exclude outliers will distort, sometimes significantly, the outcome of the detected communities. In short, it is preferable for a community detection method to detect all three structures altogether. This becomes even more interesting and also more challenging when we take the unsupervised assumption, that is, we do not assume the prior knowledge of the number KK of communities. Our approach here is to define a novel generative model and formalize the detection of overlapping communities as well as hubs and outliers as an optimization problem on it. When KK is given, we propose a normalized symmetric nonnegative matrix factorization algorithm based on Kullback–Leibler (KL) divergence to learn the parameters of the model. Otherwise, by combining KL divergence and prior model on parameters, we introduce another parameter learning method based on Bayesian symmetric nonnegative matrix factorization to learn the parameters of the model, while determining KK. Therefore, we present a community detection method arguably in the most general sense, which detects all three structures altogether without prior knowledge of the number of communities. Finally, we test the proposed method on various real-world networks. The experimental results, in contrast to several state-of-art algorithms, indicate its superior performance over other ones in terms of both clustering accuracy and community quality.
Journal: Physica A: Statistical Mechanics and its Applications - Volume 446, 15 March 2016, Pages 22–34