Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4961332 | 1446514 | 2016 | 9-page PDF | Free download |
![First page of the article: Data Preloading and Data Placement for MapReduce Performance Improving](/preview/png/4961332.png)
Hadoop clusters are widely used in both business and academia. Hadoop performance depends on many factors, including the number and clock frequency of CPU cores, RAM capacity, storage throughput, data-flow intensity, and network bandwidth and latency. In a heterogeneous computing environment, this raises the problem of optimizing how data is distributed across the computing and storage resources of a Hadoop cluster. In this paper, we propose an approach for improving data placement and suggest an implementation of the presented algorithm on the Hadoop platform. The proposed method uses the HDFS distributed cache to improve task execution performance. As a result, the introduced algorithm reduces the overall execution time of MapReduce tasks and increases I/O rates during the map stage.
Journal: Procedia Computer Science - Volume 101, 2016, Pages 379-387