Article code: 433026
Journal code: 689211
Publication year: 2014
English article: 11 pages, PDF
English title of the ISI article
PMSS: A programmable memory system and scheduler for complex memory patterns
Related topics
Engineering and Basic Sciences > Computer Engineering > Computational Theory and Mathematics
English abstract


• In this article, we propose PMSS, a Programmable Memory System and Scheduler.
• The PMSS can operate without intervention from the master core or the operating system.
• It schedules multiple accelerators and manages their memory access patterns.
• The system is evaluated with memory-intensive accelerators tested on a Xilinx ML505 evaluation FPGA board.
• Results show that the PMSS achieves a 19x speed-up compared to a generic multi-accelerator system.

The HPC industry demands more computing units on FPGAs to enhance performance through task and data parallelism. FPGAs can deliver their best performance on certain kernels by customizing the hardware for the application. However, applications are becoming more complex, with multiple kernels and complex data arrangements, which generates overhead when scheduling and managing system resources. For this reason, all classes of multithreaded machines, from minicomputers to supercomputers, require an efficient hardware scheduler and memory manager that improve the effective bandwidth and latency of the DRAM main memory. Such an architecture could be a very competitive choice for supercomputing systems, meeting the demand for parallelism in HPC benchmarks. In this article, we propose a Programmable Memory System and Scheduler (PMSS), which provides high-speed complex data access patterns to the multithreaded architecture. The proposed PMSS is implemented and tested on a Xilinx ML505 evaluation FPGA board. Its performance is compared with that of a microprocessor-based system that has been integrated with the Xilkernel operating system. Results show that the modified PMSS-based multi-accelerator system consumes 50% fewer hardware resources and 32% less on-chip power, and achieves approximately a 19x speedup, compared to the MicroBlaze-based system.

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Parallel and Distributed Computing - Volume 74, Issue 10, October 2014, Pages 2983–2993