Article code: 523846 | Journal code: 868506 | Publication year: 2013 | English article: 11 pages, PDF
English title of the ISI article
Parallelizing heavyweight debugging tools with mpiecho
Related subjects
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
Abstract

Idioms created for debugging execution on single processors and multicore systems have been successfully scaled to thousands of processors, but there is little hope that this class of techniques can continue to be scaled out to tens of millions of cores. In order to allow development of more scalable debugging idioms we introduce mpiecho, a novel runtime platform that enables cloning of MPI ranks. Given identical execution on each clone, we then show how heavyweight debugging approaches can be parallelized, reducing their overhead to a fraction of the serialized case. We also show how this platform can be useful in isolating the source of hardware-based nondeterministic behavior and provide a case study based on a recent processor bug at LLNL. While total overhead will depend on the individual tool, we show that the platform itself contributes little: 512x tool parallelization incurs at worst 2x overhead across the NAS Parallel benchmarks, and hardware fault isolation contributes at worst an additional 44% overhead. Finally, we show how mpiecho can lead to near-linear reduction in overhead when combined with maid, a heavyweight memory tracking tool provided with Intel's pin platform. We demonstrate overhead reduction from 1466% to 53% and from 740% to 14% for cg (class D, 64 processes) and lu (class D, 64 processes), respectively, using only an additional 64 cores.
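
As a rough illustration of the figures quoted in the abstract, the following sketch computes by what factor the reported tool overhead shrinks for each benchmark. It is not part of mpiecho; the helper name `reduction_factor` and the dictionary layout are assumptions made for this example, and only the four percentages come from the abstract.

```python
# Overhead percentages as quoted in the abstract:
# (serialized tool overhead, overhead when run with mpiecho).
reported = {
    "cg": (1466.0, 53.0),
    "lu": (740.0, 14.0),
}


def reduction_factor(serial_pct, parallel_pct):
    """Hypothetical helper: ratio of serialized to parallelized overhead."""
    return serial_pct / parallel_pct


# Map each benchmark name to its overhead reduction factor.
factors = {name: reduction_factor(s, p) for name, (s, p) in reported.items()}

for name, factor in factors.items():
    print(f"{name}: overhead reduced by a factor of about {factor:.1f}")
```

The ratio only compares the two overhead figures for each benchmark; it says nothing about how the work is divided among clones, which the abstract does not describe.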


► Heavyweight debugging tools have become a staple of modern best programming practices.
► However, their overhead precludes their use where machine time is expensive, particularly in high-performance computing.
► Our contribution is demonstrating that this overhead is susceptible to parallelization.
► We introduce a runtime system, mpiecho, that allows cloning of arbitrary MPI nodes.
► Given identical execution on multiple nodes we then show how heavyweight debugging can be parallelized, reducing total overhead to a level compatible with HPC.

Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 39, Issue 3, March 2013, Pages 156–166