Article code: 414246
Journal code: 680858
Publication year: 2015
English article: 10 pages (PDF)
Full-text version: free download
English title of the ISI article
Hierarchical Collective I/O Scheduling for High-Performance Computing
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract


• Hierarchical I/O scheduling for two-phase collective I/O.
• In-depth cost analysis of collective I/O.
• A model to predict the shuffle cost.
• Implementation in MPI-IO and PVFS file systems.
• Concurrent applications' cost analysis and comparison.

The non-contiguous access pattern of many scientific applications results in a large number of I/O requests, which can seriously limit data-access performance. Collective I/O has been widely used to address this issue. However, the performance of collective I/O can degrade dramatically in today's high-performance computing systems because of the increasing shuffle cost caused by highly concurrent data accesses. The situation tends to worsen as applications become more data intensive. Previous research has primarily focused on optimizing the I/O access cost in collective I/O but has largely ignored the shuffle cost involved. Previous work assumes that the lowest average response time leads to the best QoS and performance, but that is not always true for collective requests once the additional shuffle cost is considered. In this study, we propose a new hierarchical I/O scheduling (HIO) algorithm to address the increasing shuffle cost in collective I/O. The fundamental idea is to schedule applications' I/O requests based on a shuffle cost analysis to achieve the optimal overall performance, instead of optimizing I/O accesses only. The algorithm is currently evaluated with MPICH3 and PVFS2. Both theoretical analysis and experimental tests show that the proposed hierarchical I/O scheduling has the potential to address the performance degradation of collective I/O under highly concurrent accesses.
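
The claim that the lowest per-request response time does not always give the best result for collective requests can be illustrated with a toy model. The sketch below is not the paper's HIO algorithm or its cost model. It assumes a collective read in which an application's I/O phase ends when its slowest sub-request finishes, the shuffle phase follows and can overlap with another application's I/O, and each server handles one sub-request at a time. The function names and all timings are hypothetical.

```python
from itertools import permutations


def completion_times(service, shuffle, order_per_server):
    """Return the finish time of each application's collective request.

    service[app][server] is the assumed I/O time of that app's sub-request
    on that server, shuffle[app] is its assumed shuffle time, and
    order_per_server[server] lists the apps in the order the server runs them.
    """
    io_done = {app: 0.0 for app in service}
    for server, order in enumerate(order_per_server):
        clock = 0.0
        for app in order:
            clock += service[app][server]
            # The I/O phase ends only when the slowest sub-request is done.
            io_done[app] = max(io_done[app], clock)
    # Toy assumption: the shuffle starts right after the app's I/O phase.
    return {app: io_done[app] + shuffle[app] for app in service}


def sjf_orders(service, n_servers):
    """Each server independently runs its shortest sub-request first."""
    return [sorted(service, key=lambda app: service[app][s])
            for s in range(n_servers)]


def best_global_order(service, shuffle, n_servers):
    """Brute-force the single app order, shared by all servers, that
    minimizes the time at which the last application finishes."""
    def makespan(order):
        times = completion_times(service, shuffle, [list(order)] * n_servers)
        return max(times.values())
    return min(permutations(service), key=makespan)


# Hypothetical workload: two applications, two servers.
service = {"A": [1, 3], "B": [2, 2]}
shuffle = {"A": 4, "B": 1}

sjf = completion_times(service, shuffle, sjf_orders(service, 2))
order = best_global_order(service, shuffle, 2)
aware = completion_times(service, shuffle, [list(order)] * 2)
print("per-server SJF:", sjf, "last finish:", max(sjf.values()))
print("shared order  :", order, aware, "last finish:", max(aware.values()))
```

With these made-up numbers, per-server shortest-job-first finishes the last application at 9 time units, while serving A first on both servers finishes at 7, because A's longer shuffle then overlaps with B's I/O. The brute-force search is only practical for a handful of applications and is used here for clarity, not as a proposed scheduler.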

Publisher
Database: Elsevier - ScienceDirect
Journal: Big Data Research - Volume 2, Issue 3, September 2015, Pages 117–126