کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
432370 688869 2013 15 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Combining multi-core and GPU computing for solving combinatorial optimization problems
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Combining multi-core and GPU computing for solving combinatorial optimization problems
چکیده انگلیسی


• Addressing the design and implementation of B&B algorithms for GPU-enhanced multi-core environments.
• Dealing with irregular tree-based search algorithm on SIMD architecture.
• Proposing new patterns for combining multi-core and GPU computing for B&B.
• Minimizing the CPU–GPU communication overhead.

In this paper, we revisit the design and implementation of Branch-and-Bound (B&B) algorithms for solving large combinatorial optimization problems on GPU-enhanced multi-core machines. B&B is a tree-based optimization method that uses four operators (selection, branching, bounding and pruning) to build and explore a highly irregular tree representing the solution space. In our previous works, we have proposed a GPU-accelerated approach in which only a single CPU core is used and only the bounding operator is performed on the GPU device. Here, we extend the approach (LL-GB&B) in order to minimize the CPU–GPU communication latency and thread divergence. Such an objective is achieved through a GPU-based fine-grained parallelization of the branching and pruning operators in addition to the bounding one. The second contribution consists in investigating the combination of a GPU with multi-core processing. Two scenarios have been explored leading to two approaches: a concurrent (RLL-GB&B) and a cooperative one (PLL-GB&B). In the first one, the exploration process is performed concurrently by the GPU and the CPU cores. In the cooperative approach, the CPU cores prepare and off-load to GPU pools of tree nodes using data streaming while the GPU performs the exploration. The different approaches have been extensively experimented on the Flowshop scheduling problem. Compared to a single CPU-based execution, LL-GB&B allows accelerations up to (××160) for large problem instances. Moreover, when combining multi-core and GPU, we figure out that using RLL-GB&B is not beneficial while PLL-GB&B enables an improvement up to 36% compared to LL-GB&B.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Parallel and Distributed Computing - Volume 73, Issue 12, December 2013, Pages 1563–1577
نویسندگان
, , , ,