Article code: 13431233
Journal code: 1842474
Publication year: 2020
English article: 23-page PDF
Full-text version: free download
English title of the ISI article
Optimizing and accelerating space-time Ripley's K function based on Apache Spark for distributed spatiotemporal point pattern analysis
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract
As more point-of-interest (POI) datasets with fine-grained spatial and temporal attributes become available, the space-time Ripley's K function has come to be regarded as a powerful approach for analyzing spatiotemporal point processes. However, the space-time Ripley's K function is computationally intensive because of its point-wise distance comparisons, edge correction, and simulations for significance testing. Parallel computing technologies such as OpenMP, MPI and CUDA have been leveraged to accelerate the K function, and related experiments have demonstrated substantial speedups. Nevertheless, previous work has not extended the optimization of Ripley's K function from the spatial dimension to the space-time dimension. Without sophisticated spatiotemporal query and partitioning mechanisms, extra computational overhead can be problematic. Meanwhile, these studies were limited by the restricted scalability and relatively high programming cost of parallel frameworks, which impeded their application to large POI datasets and to variations of Ripley's K function. This paper presents a distributed computing method that accelerates the space-time Ripley's K function on the state-of-the-art distributed computing framework Apache Spark. Four strategies are adopted to simplify the calculation procedures and accelerate distributed computing: (1) a spatiotemporal index based on the R-tree is used to retrieve potential spatiotemporally neighboring points with fewer distance comparisons; (2) spatiotemporal edge correction weights are reused through a 2-tier cache to reduce repetitive computation in L value estimation and simulations; (3) spatiotemporal partitioning using a KDB-tree is adopted to decrease ghost buffer redundancy in partitions and support near-balanced distributed processing; (4) customized serialization with compact representations of spatiotemporal objects and indexes is developed to lower the cost of data transmission.
Based on the optimized method, a web-based visual analytics framework prototype has been developed. Experiments confirm the feasibility and time efficiency of the proposed method and demonstrate its value in promoting applications of the space-time Ripley's K function in ecology, geography, sociology, economics, urban transportation and other fields.
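To make the computational cost mentioned in the abstract concrete, the following is a minimal sketch of a naive space-time K estimate in plain Python. It is not the paper's method: it uses none of the four strategies (no R-tree index, no weight cache, no KDB-tree partitioning, no Spark), it assumes every edge-correction weight equals 1, and it assumes the normalization |A|·T / n², which is one common convention among several. The function name `space_time_k` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def space_time_k(points, s_max, t_max, area, duration):
    """Naive O(n^2) space-time K estimate for (x, y, t) points.

    Assumptions (not taken from the paper): edge-correction weights
    are all 1, and the estimate is normalized by area * duration / n^2.
    """
    n = len(points)
    if n == 0:
        raise ValueError("at least one point is required")
    count = 0
    for i, (xi, yi, ti) in enumerate(points):
        for j, (xj, yj, tj) in enumerate(points):
            if i == j:
                continue  # a point is never its own neighbor
            close_in_space = math.hypot(xi - xj, yi - yj) <= s_max
            close_in_time = abs(ti - tj) <= t_max
            if close_in_space and close_in_time:
                count += 1  # ordered pair (i, j) lies within both thresholds
    return area * duration * count / (n * n)


if __name__ == "__main__":
    # Four made-up events in a 10 x 10 study area observed for 10 time units.
    events = [(0, 0, 0), (1, 0, 0), (0, 0, 5), (10, 10, 10)]
    print(space_time_k(events, s_max=1.5, t_max=1, area=100, duration=10))
```

The double loop compares every ordered pair of points, which is the point-wise distance comparison cost the abstract refers to. Significance testing repeats this whole computation once per simulated pattern, so the cost multiplies by the number of simulations.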
Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 105, April 2020, Pages 96-118