Article code: 461236
Journal code: 696578
Publication year: 2016
English article: 16 pages, PDF
English title of the ISI article
Segmenting large traces of inter-process communication with a focus on high performance computing systems
Keywords
Dynamic analysis, Trace abstraction and analysis, Inter-process communication traces, High performance computing systems, Software maintenance, Program comprehension
Related topics
Physical Sciences and Engineering, Computer Science, Computer Networks and Communications
Abstract


• An approach for segmenting traces of HPC systems is proposed.
• The approach fosters the automatic detection of communication patterns.
• The segmentation mechanism relies on a technique used for segmenting DNA sequences.
• The segmentation process is applied to traces of hundreds of millions of events.

The understanding of the interactions among processes of a High Performance Computing (HPC) system can be made easier if trace analysis is used. Traces, however, can be quite large, making it difficult to analyze their content unless some abstraction is provided. This paper presents a novel trace abstraction approach that aims to facilitate the analysis of large execution traces generated from HPC applications. Our approach allows automatic segmentation of large traces into smaller and meaningful clusters that reflect the various execution phases of the traced scenarios. Our approach is based on the application of information theory principles to the analysis of sequences of communication patterns extracted from traces of HPC systems. This work is inspired by recent studies in the field of bioinformatics where several techniques have been proposed to segment DNA sequences into homogeneous sub-domains, where each sub-domain exhibits a certain degree of internal homogeneity. Trace segments can be used in a number of applications such as recovering high-level views of the system behavior and program understanding. We demonstrate the usefulness of our approach by applying it to different traces of hundreds of millions of events, generated from two HPC systems.
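The abstract does not name the information-theoretic measure it uses. As a rough illustration only, the sketch below assumes the Jensen-Shannon divergence criterion that is common in the DNA segmentation literature: a sequence of communication-pattern labels is cut at the point where the two sides differ most in symbol distribution, and each side is then cut again recursively. The function names, the fixed `threshold`, and `min_len` are hypothetical; the fixed threshold stands in for whatever stopping rule the paper actually applies.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the symbol frequencies in a sequence."""
    n = len(symbols)
    if n == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two subsequences."""
    n = len(left) + len(right)
    return (shannon_entropy(left + right)
            - (len(left) / n) * shannon_entropy(left)
            - (len(right) / n) * shannon_entropy(right))


def best_split(seq, min_len=2):
    """Return (index, divergence) of the cut with the largest divergence."""
    best_i, best_d = None, 0.0
    for i in range(min_len, len(seq) - min_len + 1):
        d = js_divergence(seq[:i], seq[i:])
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


def segment(seq, threshold=0.3, min_len=2, offset=0):
    """Recursively cut seq and return the sorted boundary indices.

    ASSUMPTION: a fixed divergence threshold is used as the stopping rule;
    it is a placeholder, not the criterion from the paper.
    """
    i, d = best_split(seq, min_len)
    if i is None or d < threshold:
        return []
    return (segment(seq[:i], threshold, min_len, offset)
            + [offset + i]
            + segment(seq[i:], threshold, min_len, offset + i))


# Toy trace: eight events of pattern "A" followed by eight of pattern "B".
trace = ["A"] * 8 + ["B"] * 8
print(segment(trace))  # → [8]
```

Each boundary index marks where one homogeneous phase ends and the next begins. On real traces the input symbols would be the extracted communication patterns rather than single letters.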

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 120, October 2016, Pages 1–16