Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6874367 | Journal of Computational Science | 2018 | 18 Pages |
Abstract
Hybrid storage meets the demands for real-time processing and large capacity that big data imposes. However, HDFS, the widely used big data storage platform, still cannot efficiently utilize emerging devices in a hybrid environment, and most existing research on hybrid storage fails to consider the asymmetric characteristics of devices and data. This paper proposes a preference model that quantitatively weights the imbalance in storage performance when data are distributed across different devices, and then places data on the storage device whose performance matches the data's access characteristics. The implemented Preference-Aware HDFS (PAHDFS) shows high performance, efficiency, and scalability in experiments.
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics
Authors
Wei Zhou, Dan Feng, Zhipeng Tan, Yingfei Zheng