Article ID: 6874367
Journal: Journal of Computational Science
Published Year: 2018
Pages: 18
File Type: PDF
Abstract
Hybrid storage can meet the real-time processing and large-capacity requirements of big data. However, HDFS, the widely used big data storage platform, still cannot efficiently utilize emerging storage devices in a hybrid environment, and most existing hybrid storage research fails to consider the asymmetric characteristics of devices and data. This paper proposes a preference model that quantitatively weights the storage performance imbalance that arises when data are distributed across different devices, and then places data on the storage device whose performance best matches the data's access characteristics. The resulting implementation, Preference-Aware HDFS (PAHDFS), shows high performance, efficiency, and scalability in experiments.
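The abstract does not give the preference model itself, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a hypothetical cost (estimated I/O seconds per hour for a block on a device), treats the spread of that cost across devices as the "imbalance", and greedily places the blocks with the largest imbalance first on the cheapest device that still has room. All names (`Device`, `Block`, `hourly_io_seconds`, `place`) and all numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    read_mbps: float   # assumed sustained read throughput
    write_mbps: float  # assumed sustained write throughput
    free_gb: float     # capacity still available


@dataclass(frozen=True)
class Block:
    name: str
    size_gb: float
    read_fraction: float      # share of accesses that are reads, in [0, 1]
    accesses_per_hour: float  # how "hot" the data is


def hourly_io_seconds(device: Device, block: Block) -> float:
    """Hypothetical cost: seconds of I/O per hour if `block` lives on `device`."""
    size_mb = block.size_gb * 1024
    per_access = size_mb * (
        block.read_fraction / device.read_mbps
        + (1 - block.read_fraction) / device.write_mbps
    )
    return block.accesses_per_hour * per_access


def place(blocks: list[Block], devices: list[Device]) -> dict[str, str]:
    """Greedy placement: blocks that suffer most from a bad device choose first."""
    free = {d.name: d.free_gb for d in devices}

    def imbalance(block: Block) -> float:
        costs = [hourly_io_seconds(d, block) for d in devices]
        return max(costs) - min(costs)

    placement: dict[str, str] = {}
    for block in sorted(blocks, key=imbalance, reverse=True):
        candidates = [d for d in devices if free[d.name] >= block.size_gb]
        if not candidates:
            raise ValueError(f"no device has room for {block.name}")
        best = min(candidates, key=lambda d: hourly_io_seconds(d, block))
        free[best.name] -= block.size_gb
        placement[block.name] = best.name
    return placement


# Illustrative numbers only: a small fast device and a large slow one.
devices = [
    Device("ssd", read_mbps=500, write_mbps=400, free_gb=1),
    Device("hdd", read_mbps=150, write_mbps=120, free_gb=100),
]
blocks = [
    Block("cold", size_gb=1, read_fraction=0.5, accesses_per_hour=1),
    Block("hot", size_gb=1, read_fraction=0.9, accesses_per_hour=100),
]
placement = place(blocks, devices)
```

In this toy setup the frequently read block claims the limited fast device and the rarely accessed block falls back to the slower one. A real system would also have to handle replication, migration cost, and changing access patterns, none of which this sketch attempts.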
Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics