کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
4956763 | 1444591 | 2017 | 19 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
Scalable shared-memory architecture to solve the Knapsack 0/1 problem
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
شبکه های کامپیوتری و ارتباطات
پیش نمایش صفحه اول مقاله
چکیده انگلیسی
In this paper we propose a shared memory, parametrizable architecture to compute the DP matrix for the Knapsack 0/1. Our system has a parallel runtime of Î(mC/q) for a knapsack of capacity C with m items and q operators. Using additional off-chip space and DMA transfers it can solve knapsacks of any size. The architecture is implemented on the ZYNQ-7020 System On Chip (SoC) that contains a dual-core ARM plus Artix FPGA fabric. Under such architecture we make use of 64-bit High Performance ports for off-chip transfers and asymmetric 32-bit write/64-bit read BRAMs to minimize data exchange times. We also exploit computation synchronization to minimize BRAM address propagation and reduce routing congestion. We present results for a base system with 70 Processing Elements (PEs) capable of solving problems with a maximum item weight Ïmax=1024. For more complex instances we configure the architecture with 58 PEs and Ïmax=6144, where a single BRAM is shared among 13 computing units. We thus solve problems with 6ââ¯Ã⯠bigger weights than previous works, attain a 16ââ¯Ã⯠speed-up versus an optimized software on an Intel Xeon E5 and get the highest efficiency per core versus other architectures. We achieve between 2.4â3.3à acceleration versus previous FPGA solutions.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Microprocessors and Microsystems - Volume 50, May 2017, Pages 189-201
Journal: Microprocessors and Microsystems - Volume 50, May 2017, Pages 189-201
نویسندگان
Fernando A. Escobar, Anthony Kolar, Naim Harb, Filipe Vinci Dos Santos, Carlos Valderrama,