Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4951649 | 1441481 | 2017 | 47-page PDF | Free download |
English title of the ISI article
A portable and adaptable fault tolerance solution for heterogeneous applications
Keywords
Related topics
Engineering and basic sciences
Computer engineering
Computational theory and mathematics
English abstract
Heterogeneous systems have increased their popularity in recent years due to the high performance and reduced energy consumption capabilities provided by using devices such as GPUs or Xeon Phi accelerators. This paper proposes a checkpoint-based fault tolerance solution for heterogeneous applications, allowing them to survive fail-stop failures in the host CPU or in any of the accelerators used. Besides, applications can be restarted changing the host CPU and/or the accelerator device architecture, and adapting the computation to the number of devices available during recovery. The proposed solution is built combining CPPC (ComPiler for Portable Checkpointing), an application-level checkpointing tool, and HPL (Heterogeneous Programming Library), a library that facilitates the development of OpenCL-based applications. Experimental results show the low overhead introduced by the proposal and prove its portability and adaptability benefits.
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Parallel and Distributed Computing - Volume 104, June 2017, Pages 146-158
Authors
Nuria Losada, Basilio B. Fraguela, Patricia González, María J. Martín