Article code | Journal code | Publication year | English article | Full-text version
455727 | 695540 | 2013 | 19-page PDF | Free download
English title of the ISI article
A new parallel recomputing code design methodology for fast failure recovery
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
Abstract

As the size of large-scale computer systems increases, their mean time between failures is becoming significantly shorter than the execution time of many current scientific applications. The fault-tolerant parallel algorithm (FTPA) is an application-level fault-tolerance approach that achieves fast self-recovery through parallel recomputing. In our prior work, loop parallelization was used to design the parallel recomputing code for FTPA. In the present paper, we first propose a new parallel recomputing code design methodology. Second, we automate this methodology using compiler technology. Finally, we evaluate the performance of our approach with five programs on Tianhe-1A. The experimental results show that the parallel recomputing code generated by the new method achieves higher parallel recomputing efficiency.

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Electrical Engineering - Volume 39, Issue 4, May 2013, Pages 1095–1113