Article code: 6874875
Journal code: 1441461
Publication year: 2018
English article: 32 pages, PDF
Full-text version: Free download
English title of the ISI article
Scalable data analytics using crowdsourced repositories and streams
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract
The scalable analysis of crowdsourced data repositories and streams has quickly become a critical experimental asset in multiple fields. It enables the systematic aggregation of otherwise dispersed data sources and their efficient processing using significant amounts of computational resources. However, the considerable amount of crowdsourced social data and the numerous criteria to observe can limit off-line and on-line analytical processing due to the intrinsic computational complexity. This paper demonstrates the efficient parallelisation of profiling and recommendation algorithms using tourism crowdsourced data repositories and streams. Using the Yelp data set for restaurants, we have explored two different profiling approaches, entity-based and feature-based, using ratings, comments, and location. Concerning recommendation, we use a collaborative recommendation filter employing singular value decomposition with stochastic gradient descent (SVD-SGD). To accurately compute the final recommendations, we have applied post-recommendation filters based on venue suitability, value for money, and sentiment. Additionally, we have built a social graph for enrichment. Our master-worker implementation shows super-linear scalability for 10, 20, 30, 40, 50, and 60 concurrent instances.
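For readers unfamiliar with the SVD-SGD filter named in the abstract, the following is a minimal single-process sketch of the general technique (latent-factor matrix factorisation fitted by stochastic gradient descent). It is not the authors' implementation and shows none of their parallelisation or post-recommendation filters. The function names, hyperparameters, and the tiny rating set are all hypothetical and are not taken from the Yelp data set.

```python
import random

# Hypothetical toy ratings: (user, venue, stars). Not Yelp data.
RATINGS = [
    ("ana", "cafe", 5.0), ("ana", "grill", 4.0), ("ana", "sushi", 5.0),
    ("bob", "cafe", 2.0), ("bob", "grill", 1.0), ("bob", "pizza", 2.0),
    ("eve", "grill", 3.0), ("eve", "sushi", 4.0), ("eve", "pizza", 3.0),
    ("dan", "cafe", 4.0), ("dan", "pizza", 3.0), ("dan", "sushi", 5.0),
]
GLOBAL_MEAN = sum(r for _, _, r in RATINGS) / len(RATINGS)


def train_svd_sgd(ratings, n_factors=2, n_epochs=300, lr=0.02, reg=0.02, seed=0):
    """Fit biases and latent factors to (user, item, rating) triples with SGD.

    Returns a predict(user, item) function. All defaults are illustrative.
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    # Small random initial factors; biases start at zero.
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in sorted(users)}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in sorted(items)}
    bu = {u: 0.0 for u in users}
    bi = {i: 0.0 for i in items}

    data = list(ratings)
    for _ in range(n_epochs):
        rng.shuffle(data)  # visit the ratings in a fresh random order each epoch
        for u, i, r in data:
            pred = mu + bu[u] + bi[i] + sum(a * b for a, b in zip(p[u], q[i]))
            err = r - pred
            # Gradient step on the L2-regularised squared error of this rating.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)

    zeros = [0.0] * n_factors

    def predict(user, item):
        # Unknown users or items fall back to the global mean (zero bias and factors).
        pu = p.get(user, zeros)
        qi = q.get(item, zeros)
        return (mu + bu.get(user, 0.0) + bi.get(item, 0.0)
                + sum(a * b for a, b in zip(pu, qi)))

    return predict


def rmse(predict, ratings):
    """Root-mean-square error of a predictor over rating triples."""
    return (sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings)) ** 0.5


model = train_svd_sgd(RATINGS)
print("baseline RMSE:", rmse(lambda u, i: GLOBAL_MEAN, RATINGS))
print("trained RMSE: ", rmse(model, RATINGS))
print("predicted rating for bob at sushi:", model("bob", "sushi"))
```

Each rating triggers its own small update, so the training data can in principle be partitioned across workers. How the paper actually distributes the work among its master and worker instances is not described in the abstract and is not modelled here.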
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Parallel and Distributed Computing - Volume 122, December 2018, Pages 1-10