Bit-parallel approximate pattern matching: Kepler GPU versus Xeon Phi

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
523839	868503	2016	11 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Xeon Phi Approximate pattern matching - تطبیق الگوی تقریبی GPU - واحد پردازش گرافیکی CUDA - کودا. پردازش موازی و مدل برنامه‌نویسی

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر

پیش نمایش صفحه اول مقاله

Bit-parallel approximate pattern matching: Kepler GPU versus Xeon Phi

چکیده انگلیسی

• Advanced SIMD features on GPUs and Xeon Phis promote efficient long pattern search.
• A tiled approach to accelerating the Wu–Manber algorithm on GPUs has been proposed.
• Both the GPU and Xeon Phi yield two orders-of-magnitude speedup over one CPU core.
• The GPU-based version with tiling runs up to 2.9 × faster than the Xeon Phi version.

Approximate pattern matching (APM) targets to find the occurrences of a pattern inside a subject text allowing a limited number of errors. It has been widely used in many application areas such as bioinformatics and information retrieval. Bit-parallel APM takes advantage of the intrinsic parallelism of bitwise operations inside a machine word. This approach typically encodes non-deterministic finite automaton (NFA) states or value differences between adjacent cells of a dynamic programming matrix in the form of bit arrays. Wu–Manber (WM) is a well-known bit-parallel APM algorithm, which simulates an NFA and gains parallel efficiency by performing multiple state updates within a machine word. An important parameter is the machine word size (e.g. 32 or 64 bits for CPUs). Due to increasing vector capabilities, efficient mapping of bit-parallel APM algorithms onto modern high performance computing architectures is an interesting research topic. Prominent examples are Xeon Phi coprocessors and CUDA-enabled GPUs, which provide words of size 512 bits (by means of vector registers) and 1024 bits (by means of warps), respectively. In this paper, we investigate mappings of the WM algorithm onto these two accelerator types. Both architectures are able to achieve around two orders-of-magnitude speedups compared to a single-threaded CPU implementation. Moreover, our tile-based implementation on a GeForce Titan graphics card runs up to 2.9 × faster than our implementation on an Intel Xeon Phi 5110P. Source code is available at http://xbitpar.sourceforge.net.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Parallel Computing - Volume 54, May 2016, Pages 128–138

نویسندگان

Tuan Tu Tran, Yongchao Liu, Bertil Schmidt,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Bit-parallel approximate pattern matching: Kepler GPU versus Xeon Phi

دسترسی سریع

ارتباط

English Website