Free article download: Block-level execution profile

Article code	Journal code	Publication year	English article	Full-text version
523830	868503	2016	14-page PDF	Free download

English title of the ISI article

A dynamic block-level execution profiler

Title as translated into Persian

Block-level execution profile

Keywords

HPC; Processor design; Computer architecture

Related topics

Engineering and basic sciences, Computer engineering, Computer science software

Article preview

Abstract

• We introduce a hardware-based mechanism to dynamically profile application blocks.
• Profiling information is used to prioritize critical memory loads during execution.
• Our mechanism yields better accuracy and performance gains than previous proposals.
• We extensively analyze how our mechanism improves performance.
• Results show that it alleviates prefetch inter-core interference.

Most performance enhancing mechanisms in current processors, such as branch predictors or prefetchers, rely on program characteristics monitored at the granularity of single instructions. However, many of these characteristics can be obtained at the basic block-level instead. The coarser granularity allows a larger portion of the code to be examined, enabling a more accurate profiling and a detailed analysis of the different types of instructions executed within a block. Therefore, block-level analysis can be advantageous for performance enhancing mechanisms, as it allows us to look at how the instructions influence each other, and thus detect complex behavior patterns.

In this paper, we present the Dynamic Block-Level Execution Profiler (DBLEP), a basic block level online mechanism that profiles micro-architectural bottlenecks, such as delinquent memory loads, hard-to-predict branches and contention for functional units. DBLEP operates at the basic block level and provides information that can be used to reduce the impact of these bottlenecks. A prefetch dropping scheme and a memory controller policy were developed to use the code profiling information provided by DBLEP. By taking advantage of the high profiling accuracy, these mechanisms are able to improve the processor’s performance by up to 18.6% (5.3% on average). We show that our mechanism’s performance is comparable to mechanisms that work on single instruction granularity, using less hardware.
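The general idea of aggregating bottleneck statistics per basic block, and then using them to prioritize memory requests, can be illustrated with a small software model. This is only a toy sketch and not the hardware design from the paper: the class and function names (`BlockStats`, `BlockProfiler`, `order_requests`), the miss-ratio threshold, and the minimum-sample count are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Counters kept for one basic block (hypothetical layout)."""
    loads: int = 0
    load_misses: int = 0
    branches: int = 0
    mispredicts: int = 0


class BlockProfiler:
    """Toy block-level profiler; thresholds are illustrative assumptions."""

    def __init__(self, miss_threshold=0.5, min_loads=4):
        self.miss_threshold = miss_threshold  # assumed cut-off for "delinquent"
        self.min_loads = min_loads            # assumed minimum sample size
        self.stats = defaultdict(BlockStats)  # keyed by block start address

    def record_load(self, block_id, missed):
        s = self.stats[block_id]
        s.loads += 1
        s.load_misses += int(missed)

    def record_branch(self, block_id, mispredicted):
        s = self.stats[block_id]
        s.branches += 1
        s.mispredicts += int(mispredicted)

    def is_memory_critical(self, block_id):
        """True if the block's observed load miss ratio reaches the threshold."""
        s = self.stats.get(block_id)
        if s is None or s.loads < self.min_loads:
            return False
        return s.load_misses / s.loads >= self.miss_threshold


def order_requests(profiler, requests):
    """Stable sort that moves requests from memory-critical blocks to the front.

    Each request is a (block_id, tag) tuple; this ordering rule is an
    assumption, not the memory controller policy described in the paper.
    """
    return sorted(requests, key=lambda r: not profiler.is_memory_critical(r[0]))


profiler = BlockProfiler()
for missed in (True, True, True, False):
    profiler.record_load(0x400, missed)   # block with a 75% miss ratio
for _ in range(4):
    profiler.record_load(0x480, False)    # block whose loads always hit

queue = [(0x480, "A"), (0x400, "B"), (0x480, "C")]
ordered = order_requests(profiler, queue)
```

Keeping one set of counters per block rather than per instruction means that all loads and branches in the block share a single table entry, which is the kind of storage saving the abstract alludes to when it mentions using less hardware.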

Publisher

Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 54, May 2016, Pages 15–28

Authors

Francis B. Moreira, Marco A.Z. Alves, Matthias Diener, Philippe O.A. Navaux, Israel Koren
