کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
431705 688616 2009 15 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
ScalaTrace: Scalable compression and replay of communication traces for high-performance computing
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
ScalaTrace: Scalable compression and replay of communication traces for high-performance computing
چکیده انگلیسی

Characterizing the communication behavior of large-scale applications is a difficult and costly task due to code/system complexity and long execution times. While many tools to study this behavior have been developed, these approaches either aggregate information in a lossy way through high-level statistics or produce huge trace files that are hard to handle.We contribute an approach that provides orders of magnitude smaller, if not near-constant size, communication traces regardless of the number of nodes while preserving structural information. We introduce intra- and inter-node compression techniques of MPI events that are capable of extracting an application’s communication structure. We further present a replay mechanism for the traces generated by our approach and discuss results of our implementation for BlueGene/L. Given this novel capability, we discuss its impact on communication tuning and beyond. To the best of our knowledge, such a concise representation of MPI traces in a scalable manner combined with deterministic MPI call replay is without any precedent.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Parallel and Distributed Computing - Volume 69, Issue 8, August 2009, Pages 696–710
نویسندگان
, , , , ,