کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
523766 | 868488 | 2016 | 21 صفحه PDF | دانلود رایگان |
• Linear Assignment is one of the most fundamental problems in operations research.
• A creative parallelization of a Hungarian-like algorithm on GPU cluster.
• Efficient parallelization of the augmenting path search step.
• Large problems with 1.6 billion variables can be solved.
• It is probably the fastest LAP solver using a GPU.
In this paper, we describe parallel versions of two different variants (classical and alternating tree) of the Hungarian algorithm for solving the Linear Assignment Problem (LAP). We have chosen Compute Unified Device Architecture (CUDA) enabled NVIDIA Graphics Processing Units (GPU) as the parallel programming architecture because of its ability to perform intense computations on arrays and matrices. The main contribution of this paper is an efficient parallelization of the augmenting path search phase of the Hungarian algorithm. Computational experiments on problems with up to 25 million variables reveal that the GPU-accelerated versions are extremely efficient in solving large problems, as compared to their CPU counterparts. Tremendous parallel speedups are achieved for problems with up to 400 million variables, which are solved within 13 seconds on average. We also tested multi-GPU versions of the two variants on up to 16 GPUs, which show decent scaling behavior for problems with up to 1.6 billion variables and dense cost matrix structure.
Journal: Parallel Computing - Volume 57, September 2016, Pages 52–72