Article ID: 4954850
Journal: Computer Networks
Published Year: 2017
Pages: 14
File Type: PDF
Abstract

Cloud-based big data platforms are being widely adopted in industry because they simplify the implementation of big data processing and enable elastic service frameworks. With the widespread adoption of cloud-based MapReduce frameworks, a series of solutions have been proposed to improve the performance of big data services in the cloud. Most existing studies concentrate on optimizing task scheduling or resource provisioning mechanisms to improve either the data processing rate or the data transmission rate of the platform, without considering both performance factors together. Moreover, these studies seldom consider the impact of virtual network topologies on the performance of cloud-based MapReduce workflows. The purpose of this work is to optimize the topologies of the virtual networks used in cloud-based MapReduce frameworks. We formulate both the data transmission overhead and the data processing overhead of a specific cloud-based big data application, describe the optimal deployment of virtual networks as an optimization problem, and then design algorithms to solve this problem. Experimental results show that our topology optimization mechanism effectively improves the overall performance of cloud-based big data applications.

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Networks and Communications