Article code: 524343
Journal code: 868615
Publication year: 2012
English article: 27 pages (PDF)
English title of the ISI article
Elastic computing: A portable optimization framework for hybrid computers
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Science Software
English abstract

Due to power limitations and escalating cooling costs, high-performance computing systems can no longer rely solely on faster clock frequencies and numerous microprocessor nodes to meet increasing performance demands. As an alternative approach, high-performance systems are increasingly integrating multi-core processors and heterogeneous accelerators such as GPUs and FPGAs. However, usage of such hybrid systems has been limited largely to device experts due to significantly increased application design complexity. To enable more transparent usage of hybrid systems, we introduce elastic computing, which is an optimization framework where application designers invoke specialized elastic functions that contain a knowledge base of implementation alternatives and parallelization strategies. For each elastic function, a collection of optimization tools analyzes numerous possible implementations, which enables dynamic and transparent optimization for different resources and run-time parameters. In this paper, we present the enabling technologies of elastic computing, and evaluate those technologies on four different hybrid systems, including the Novo-G FPGA supercomputer. The results include detailed case studies of using elastic computing for time-domain convolution and sum of absolute difference image retrieval, which achieved speedups up to 206x.


► Elastic computing enables transparent optimization of functions for hybrid systems.
► Implementation assessment creates performance estimators to evaluate optimizations.
► Work parallelization planning predetermines how to efficiently parallelize work.
► Techniques evaluated on Novo-G and a system with 4 cores, 4 GPUs, and an FPGA.
► Numerous functions were evaluated with speedup up to 206x versus single-threaded.
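
The idea summarized in the abstract and highlights, an elastic function that bundles alternative implementations and picks one using performance estimators, can be illustrated with a small sketch. This is not the paper's framework or API: the class names (`Implementation`, `ElasticFunction`), the linear cost model, the "accelerator" stand-in (which simply runs on the CPU), and all timing samples are hypothetical and chosen only for illustration. Work parallelization planning is not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


def fit_linear(samples: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of time = slope * work + intercept."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def convolve_scatter(signal: List[float], kernel: List[float]) -> List[float]:
    """Time-domain convolution: each input sample scatters into the output."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def convolve_gather(signal: List[float], kernel: List[float]) -> List[float]:
    """Same result, but each output sample gathers its contributions."""
    out = []
    for n in range(len(signal) + len(kernel) - 1):
        acc = 0.0
        for j, k in enumerate(kernel):
            if 0 <= n - j < len(signal):
                acc += signal[n - j] * k
        out.append(acc)
    return out


@dataclass
class Implementation:
    """One implementation alternative plus its performance estimator."""
    name: str
    run: Callable[[List[float], List[float]], List[float]]
    slope: float = 0.0
    intercept: float = 0.0

    def assess(self, samples: Sequence[Tuple[float, float]]) -> None:
        # "Implementation assessment": build an estimator from timing samples.
        self.slope, self.intercept = fit_linear(samples)

    def estimate(self, work: float) -> float:
        return self.slope * work + self.intercept


class ElasticFunction:
    """Dispatches each call to the implementation predicted to be fastest."""

    def __init__(self, implementations: Sequence[Implementation]) -> None:
        self.implementations = list(implementations)

    def select(self, work: float) -> Implementation:
        return min(self.implementations, key=lambda impl: impl.estimate(work))

    def __call__(self, signal: List[float], kernel: List[float]) -> List[float]:
        work = len(signal) * len(kernel)  # multiply-accumulate count
        return self.select(work).run(signal, kernel)


# Hypothetical (work, seconds) samples; NOT measurements from the paper.
cpu = Implementation("cpu_direct", convolve_scatter)
cpu.assess([(100, 1.0), (1000, 10.0)])        # no fixed cost, steep slope
accel = Implementation("accelerator_stub", convolve_gather)
accel.assess([(100, 5.1), (1000, 6.0)])       # fixed overhead, shallow slope

elastic_convolve = ElasticFunction([cpu, accel])
result = elastic_convolve([1, 2, 3], [1, 1])
```

With these made-up samples, the estimators cross at roughly 556 units of work, so small inputs go to `cpu_direct` and large ones to `accelerator_stub`. The caller only ever invokes `elastic_convolve`, which is the kind of transparency the abstract describes.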

Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 38, Issue 8, August 2012, Pages 438–464