Article code, Journal code, Publication year, English article, Full-text version
6935072, 1449557, 2018, 12-page PDF, Free download
English title of the ISI article
Exploring the interplay of resilience and energy consumption for a task-based partial differential equations preconditioner
Persian translation of the title
بررسی رابطه بین انعطاف پذیری و مصرف انرژی برای یک معادله دیفرانسیل با استفاده از معادلات جزئی مبتنی بر کار
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract
We discuss algorithm-based resilience to silent data corruptions (SDCs) in a task-based domain-decomposition preconditioner for partial differential equations (PDEs). The algorithm exploits a reformulation of the PDE as a sampling problem, followed by a solution update through data manipulation that is resilient to SDCs. The implementation is based on a server-client model where all state information is held by the servers, while clients are designed solely as computational units. Scalability tests run up to ∼51K cores show a parallel efficiency greater than 90%. We use a 2D elliptic PDE and a fault model based on random single and double bit-flip to demonstrate the resilience of the application to synthetically injected SDC. We discuss two fault scenarios: one based on the corruption of all data of a target task, and the other involving the corruption of a single data point. We show that for our application, given the test problem considered, a four-fold increase in the number of faults only yields a 2% change in the overhead to overcome their presence, from 7% to 9%. We then discuss potential savings in energy consumption via dynamic voltage/frequency scaling, and its interplay with fault-rates, and application overhead.
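The fault model mentioned in the abstract (random single and double bit-flips, applied either to all data of a task or to a single data point) can be illustrated with a minimal sketch. This is not the authors' injector: the abstract does not give the data type, the bit-selection rule, or any function names, so the sketch assumes IEEE-754 64-bit floats and uniformly random bit positions, and `flip_word`, `flip_bits`, and `corrupt_task` are hypothetical helpers introduced only for illustration.

```python
import random
import struct


def flip_word(bits: int, n_flips: int, rng: random.Random) -> int:
    """Flip n_flips distinct, uniformly chosen bits of a 64-bit word (assumed model)."""
    for pos in rng.sample(range(64), n_flips):
        bits ^= 1 << pos
    return bits


def flip_bits(value: float, n_flips: int, rng: random.Random) -> float:
    """Apply flip_word to the binary64 encoding of value and decode the result."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", flip_word(bits, n_flips, rng)))
    return corrupted


def corrupt_task(data, n_flips, rng, single_point=False):
    """Return a corrupted copy of a task's data (hypothetical helper).

    single_point=False: every value of the task is corrupted.
    single_point=True:  exactly one randomly chosen value is corrupted.
    """
    out = list(data)
    if single_point:
        i = rng.randrange(len(out))
        out[i] = flip_bits(out[i], n_flips, rng)
    else:
        out = [flip_bits(v, n_flips, rng) for v in out]
    return out


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run is repeatable
    task_data = [1.0, 2.0, 3.0, 4.0]
    print(corrupt_task(task_data, n_flips=1, rng=rng, single_point=True))
    print(corrupt_task(task_data, n_flips=2, rng=rng))
```

A flipped exponent bit can turn a value into a very large number, an infinity, or a NaN, while a flipped low mantissa bit changes it only slightly, so corruptions under such a model vary widely in magnitude.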
Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 73, April 2018, Pages 16-27