A spatiotemporal compression based approach for efficient big data processing on Cloud

Article ID	Journal	Published Year	Pages	File Type
430689	Journal of Computer and System Sciences	2014	21 Pages	PDF

Abstract

• Spatial and temporal compression for big graph data storage and processing on Cloud.
• Temporal compression for reducing data from a single node in a graph.
• Spatial compression for reducing data from correlated nodes in a graph.
• Significant time performance gains achieved by a novel scheduling on Cloud.
• A guaranteed trade-off between data quality and processing efficiency.

It is well known that processing big graph data on Cloud can be costly. Such processing involves multiple complex iterations, which raise challenges such as parallel memory bottlenecks, deadlocks, and inefficiency. To tackle these challenges, we propose a novel technique for effectively processing big graph data on Cloud. Specifically, the big data is compressed on Cloud according to its spatiotemporal features. By exploring spatial data correlation, we partition a graph data set into clusters. Within a cluster, the workload can be shared through inference based on time series similarity. By exploiting temporal correlation, temporal data compression is conducted within each time series or single graph edge. A novel data-driven scheduling is also developed to optimise data processing. The experimental results demonstrate that the spatiotemporal compression and scheduling achieve significant performance gains with respect to both data size and data fidelity loss.
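The two compression steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a simple tolerance-based filter for temporal compression and a greedy grouping by Pearson correlation for spatial clustering. The names `compress_series`, `pearson`, `cluster_by_similarity`, and the parameters `tolerance` and `threshold` are hypothetical.

```python
from math import sqrt


def compress_series(values, tolerance):
    """Temporal compression (assumed scheme): keep a sample only when it
    differs from the last kept sample by more than `tolerance`."""
    if not values:
        return []
    kept = [(0, values[0])]  # (index, value) pairs that survive compression
    for i, v in enumerate(values[1:], start=1):
        if abs(v - kept[-1][1]) > tolerance:
            kept.append((i, v))
    return kept


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def cluster_by_similarity(series_by_node, threshold):
    """Spatial clustering (assumed scheme): a node joins the first cluster
    whose representative (its first member) correlates at or above `threshold`."""
    clusters = []
    for node, series in series_by_node.items():
        for cluster in clusters:
            representative = series_by_node[cluster[0]]
            if pearson(series, representative) >= threshold:
                cluster.append(node)
                break
        else:
            clusters.append([node])  # no similar cluster found, start a new one
    return clusters


# Hypothetical sample data: three nodes, each with a short time series.
readings = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 3, 2, 1]}
print(cluster_by_similarity(readings, threshold=0.9))
print(compress_series([10.0, 10.1, 10.2, 12.0, 12.1], tolerance=0.5))
```

In this sketch, nodes whose series move together fall into one cluster, so a single representative series could stand in for the others, while the tolerance filter drops samples that change little over time. Both parameters trade data fidelity against data size, which mirrors the trade-off named in the highlights.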

Keywords

Scheduling; Cloud computing; Big Data