Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
431471 | 688555 | 2014 | 10-page PDF | Free download |

• Analyzed the performance degradation of HDFS caused by interaction-intensive tasks.
• Designed a two-layer structure to improve the performance of handling I/O requests.
• Integrated caches to reduce the overhead of accessing interaction-intensive files.
• Developed a PSO-based storage allocation algorithm to improve the I/O throughput.
• Designed a set of experiments to evaluate the performance of the proposed methods.
The Hadoop Distributed File System (HDFS) is designed to run on commodity hardware and can be used as a stand-alone, general-purpose distributed file system (HDFS User Guide, 2008). It provides access to bulk data with high I/O throughput, which makes it suitable for applications with large I/O data sets. However, the performance of HDFS decreases dramatically when it handles operations on interaction-intensive files, i.e., files that are relatively small but frequently accessed. The paper analyzes the cause of the throughput degradation when accessing interaction-intensive files and presents an enhanced HDFS architecture, along with an associated storage allocation algorithm, that overcomes the problem. Experiments show that with the proposed architecture and its storage allocation algorithm, HDFS throughput for interaction-intensive files increases by 300% on average, with only a negligible performance decrease for large data set tasks.
Journal: Journal of Parallel and Distributed Computing - Volume 74, Issue 8, August 2014, Pages 2770–2779