Article code: 7375477
Journal code: 1480070
Publication year: 2018
English article: 16 pages, full-text PDF
English title of the ISI article
PSPLPA: Probability and similarity based parallel label propagation algorithm on spark
Related topics
Engineering and Basic Sciences, Mathematics, Mathematical Physics
Abstract
With the rapid growth of social networks, the cost of computation is increasing, and many existing algorithms are not suitable for large-scale data. Apache Spark is an open-source cluster computing framework that makes it possible to solve the community detection problem on a cluster of computers. In this paper, we propose a novel label propagation algorithm on Spark, called PSPLPA (Probability and Similarity based Parallel Label Propagation Algorithm). PSPLPA employs a new label updating strategy that uses probability in the label propagation procedure during each iteration. First, weight calculation based on k-shell is integrated into the label initialization process. Second, parallel propagation steps are proposed to utilize label probability efficiently. Third, randomness in label updating is significantly reduced via automatic label selection and similarity computation. Experiments conducted on artificial and real social networks demonstrate that the proposed algorithm exhibits high scalability and high accuracy.
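To make the ingredients named in the abstract more concrete, here is a minimal single-machine sketch in Python. It is not the PSPLPA algorithm and does not use Spark: it only illustrates k-shell decomposition and a plain synchronous label propagation in which each neighbor's vote is weighted by its k-shell index. The function names `k_shell` and `propagate_labels`, the toy graph, and the "smallest label wins" tie-break (a stand-in for the paper's probability and similarity based selection, which the abstract does not specify) are all assumptions made for illustration.

```python
from collections import Counter


def k_shell(adj):
    """Return the k-shell index of every node by repeatedly peeling low-degree nodes."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # The next shell index is the smallest degree left in the graph.
        k = max(k, min(degree[v] for v in remaining))
        stack = [v for v in remaining if degree[v] <= k]
        while stack:
            v = stack.pop()
            if v not in remaining:
                continue  # already peeled via a duplicate stack entry
            remaining.discard(v)
            shell[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        stack.append(u)
    return shell


def propagate_labels(adj, weights, max_iters=20):
    """Synchronous label propagation with weighted neighbor votes.

    Assumption: ties are broken by taking the smallest label, so the
    result is deterministic. This is NOT the update rule of PSPLPA.
    """
    labels = {v: v for v in adj}  # every node starts in its own community
    for _ in range(max_iters):
        new_labels = {}
        for v, nbrs in adj.items():
            if not nbrs:
                new_labels[v] = labels[v]  # isolated node keeps its label
                continue
            score = Counter()
            for u in nbrs:
                score[labels[u]] += weights[u]
            best = max(score.values())
            new_labels[v] = min(l for l, s in score.items() if s == best)
        if new_labels == labels:
            break  # no label changed, so the propagation has converged
        labels = new_labels
    return labels


# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
shell = k_shell(adj)
labels = propagate_labels(adj, shell)
```

On this toy graph every node lies in the 2-shell, and the two triangles end up with different labels. A synchronous update like this one can oscillate on some graphs, which is one reason the iteration count is capped with `max_iters`.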
Publisher
Database: Elsevier - ScienceDirect
Journal: Physica A: Statistical Mechanics and its Applications - Volume 503, 1 August 2018, Pages 366-378