| Article ID | Journal | Published Year | Pages | File Type |
| --- | --- | --- | --- | --- |
| 6872940 | Future Generation Computer Systems | 2018 | 26 | |
Abstract
As a storage-efficient approach, erasure coding has been adopted by many large-scale cloud storage systems to protect data from server and datacenter failures. For erasure-coded storage systems, it is critical to encode newly written data blocks and generate parity blocks efficiently. Existing encoding approaches include Striping Encoding and Replicating Encoding; they either incur excessive network traffic or seriously degrade I/O performance. In this paper, we propose Incremental Encoding, a decentralized encoding framework for all linear erasure codes. To achieve optimal write performance, Incremental Encoding forwards newly written data blocks to multiple servers in a pipelined manner. To reduce network traffic, Incremental Encoding combines newly written data blocks incrementally as they flow through servers to generate parity blocks. Incremental Encoding also caches intermediate parity blocks in memory to further reduce disk I/O. We evaluate Incremental Encoding by theoretically analyzing its encoding overheads and by conducting a series of experiments in both a single-datacenter environment and a cross-datacenter environment. The analysis and experiments show that Incremental Encoding achieves a much better trade-off between network traffic and I/O performance. Specifically, compared with Replicating Encoding, which has the optimal I/O performance, Incremental Encoding delivers nearly the same I/O performance with 44.5%-48.4% less encoding traffic. Compared with Striping Encoding, Incremental Encoding delivers up to 90% better write performance and up to 108% better read performance, at the cost of 56.25%-73.6% more encoding traffic.
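The abstract's core idea is that a parity server can fold each data block into a cached intermediate parity as the block flows past, instead of gathering a full stripe before encoding. The sketch below is only an illustration of that incremental-combination step, not the paper's implementation: the `ParityServer`, `accumulate`, and `finalize` names are hypothetical, forwarding to downstream servers is omitted, and plain XOR parity stands in for a general linear erasure code (which would multiply each block by a coding coefficient in GF(2^8) before combining).

```python
# Minimal sketch (assumptions noted above) of incrementally combining
# data blocks into a parity block as they flow through a parity server.

class ParityServer:
    """Hypothetical parity server that caches an intermediate parity in memory."""

    def __init__(self, block_size: int):
        # Intermediate parity is kept in memory to avoid extra disk I/O.
        self.partial_parity = bytearray(block_size)
        self.blocks_seen = 0

    def accumulate(self, data_block: bytes) -> None:
        # Incremental combination: fold the incoming block into the cached
        # intermediate parity. A general linear code would first scale the
        # block by its coding coefficient; XOR is used here for simplicity.
        for i, b in enumerate(data_block):
            self.partial_parity[i] ^= b
        self.blocks_seen += 1

    def finalize(self, stripe_width: int) -> bytes:
        # After every data block of the stripe has flowed through,
        # the cached intermediate parity is the final parity block.
        assert self.blocks_seen == stripe_width
        return bytes(self.partial_parity)


if __name__ == "__main__":
    server = ParityServer(block_size=4)
    for block in (b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"):
        server.accumulate(block)
    print(server.finalize(stripe_width=3).hex())  # XOR of the three blocks
```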
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics
Authors
Fangliang Xu, Yijie Wang, Xingkong Ma