Article code: 523800
Journal code: 868496
Publication year: 2015
English article: 18-page PDF
Full-text version: free download
English title of the ISI article
Intel Cilk Plus for complex parallel algorithms: “Enormous Fast Fourier Transforms” (EFFT) library
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract


• We develop a new library EFFT optimized for large 1-D discrete Fourier transforms.
• EFFT performs 1.1×–1.5× faster than the industry-leading Intel Math Kernel Library.
• Intel Cilk Plus proves effective for multi-level nested parallelism and for automatic vectorization.
• Optimizations discussed in the paper are extensible to other problems with complex patterns of parallelism and memory access.

In this paper we demonstrate a methodology for parallelizing the computation of large one-dimensional discrete fast Fourier transforms (DFFTs) on multi-core Intel Xeon processors. DFFTs based on the recursive Cooley–Tukey method have to control cache utilization, memory bandwidth and vector hardware usage, and at the same time scale across multiple threads or compute nodes. Our method builds on a single-threaded Intel Math Kernel Library (MKL) implementation of real-to-complex DFFT, and uses the Intel Cilk Plus framework for thread parallelism. We demonstrate the ability of Intel Cilk Plus to handle parallel recursion with nested loop-centric parallelism without tuning the code to the number of cores or cache metrics. The result of our work is a library called EFFT that performs 1D DFTs of size 2^N for N ≥ 21 faster than the corresponding Intel MKL parallel DFT implementation by up to 1.5×, and faster than FFTW by up to 2.5×. The code of EFFT is available for free download under the GPLv3 license.

This work provides a new efficient DFFT implementation, and at the same time serves as an educational example of how computer science problems with complex parallel patterns can be optimized for high performance using the Intel Cilk Plus framework.
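To make the "parallel recursion" in the abstract concrete, here is a minimal sketch of a radix-2 Cooley–Tukey recursion with Cilk-style spawn points. It is not the EFFT code: the abstract says EFFT builds on a single-threaded MKL real-to-complex transform, whereas this sketch is a plain complex-to-complex textbook recursion. The names `fft_rec` and `fft` are hypothetical. The sketch assumes that a Cilk Plus compiler defines the `__cilk` macro and provides `<cilk/cilk.h>`; with any other compiler the keywords are defined away and the code runs serially (the serial elision).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Assumption: Cilk Plus compilers define __cilk. Elsewhere the keywords
// expand to nothing, which leaves a valid serial program.
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

using cd = std::complex<double>;

// Hypothetical helper: out-of-place radix-2 FFT of n elements read from
// `in` with the given stride and written contiguously to `out`.
void fft_rec(const cd* in, cd* out, std::size_t n, std::size_t stride) {
    if (n == 1) {
        out[0] = in[0];
        return;
    }
    const std::size_t half = n / 2;
    // The two half-size transforms read shared input but write disjoint
    // halves of `out`, so they can run in parallel without a race.
    cilk_spawn fft_rec(in, out, half, 2 * stride);       // even-indexed inputs
    fft_rec(in + stride, out + half, half, 2 * stride);  // odd-indexed inputs
    cilk_sync;

    // Butterfly step: combine the two half-size results using twiddle factors.
    const double pi = std::acos(-1.0);
    for (std::size_t k = 0; k < half; ++k) {
        const cd w = std::polar(1.0, -2.0 * pi * double(k) / double(n));
        const cd e = out[k];
        const cd o = w * out[k + half];
        out[k] = e + o;
        out[k + half] = e - o;
    }
}

// Hypothetical wrapper: forward DFT of a power-of-two-length vector.
std::vector<cd> fft(const std::vector<cd>& x) {
    assert((x.size() & (x.size() - 1)) == 0 && "length must be a power of two");
    std::vector<cd> y(x.size());
    if (!x.empty()) fft_rec(x.data(), y.data(), x.size(), 1);
    return y;
}
```

For the input 1, 2, 3, 4 this returns 10, -2+2i, -2, -2-2i. Because the spawn leaves scheduling to the runtime, nothing in the recursion depends on the core count, which matches the abstract's point about not tuning the code to the number of cores. A production code would also stop recursing at a cache-sized base case and vectorize the butterfly loop, neither of which this sketch attempts.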

Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 48, October 2015, Pages 125–142
Authors
, ,