Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4951508 | 1441474 | 2018 | 11-page PDF | Free download |
- The proposed approach significantly reduces the computation time of Zernike moments (ZMs) on a GPU.
- The proposed approach supports extending the computation of ZMs to multiple GPUs.
- The image re-layout eliminates a large number of conditional instructions.
Our study focuses on accelerating the computation of Zernike moments on graphics processing units (GPUs). Two ideas are used to achieve this goal. The first is a novel re-layout that reorders the image pixels and handles the diagonal pixels in advance, so that the computations for all pixels are allocated to a single octant effectively. The second is to leverage constant memory to store precomputed values that are shared across GPU threads. An in-depth study evaluates the performance in each case, compares it against GPU implementations of other algorithms, and discusses the bottleneck. The results show that our approach is effective and achieves a significant performance improvement over other state-of-the-art GPU implementations. Furthermore, our approach is suited to distributing the data flow across multiple GPUs.
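The abstract does not spell out how the work is confined to one octant, so the following is only a CPU-side sketch of one plausible reading: the eight-way symmetry of the unit disk. It is not the paper's GPU kernel or its re-layout. The function names, the pixel-centre coordinate mapping, and the even image size are all assumptions made for illustration. Each octant pixel computes the radial polynomial once and reuses it for its mirrored pixels, and diagonal pixels (which have only four distinct mirror images) are treated as a special case.

```python
import cmath
import math
from math import factorial


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho); requires n - |m| even and |m| <= n."""
    m = abs(m)
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def zernike_direct(img, n, m):
    """Reference version: visit every pixel of an N x N image (N even)."""
    size = len(img)
    half = size / 2
    acc = 0j
    for row in range(size):
        for col in range(size):
            x = (col + 0.5 - half) / half  # pixel centre mapped into [-1, 1]
            y = (row + 0.5 - half) / half
            rho = math.hypot(x, y)
            if rho <= 1.0:
                phase = cmath.exp(-1j * m * math.atan2(y, x))
                acc += img[row][col] * radial_poly(n, m, rho) * phase
    return (n + 1) / math.pi * acc * (2.0 / size) ** 2


def zernike_octant(img, n, m):
    """Same sum, but looping over one octant (0 < y <= x) only."""
    size = len(img)
    h = size // 2
    acc = 0j
    for a in range(h):
        for b in range(a + 1):
            x = (a + 0.5) / h
            y = (b + 0.5) / h
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue
            radial = radial_poly(n, m, rho)  # computed once, shared by all mirrors
            theta = math.atan2(y, x)
            # (column, row, polar angle) of the mirrored pixels
            points = [
                (h + a, h + b, theta),
                (h - 1 - a, h + b, math.pi - theta),
                (h - 1 - a, h - 1 - b, math.pi + theta),
                (h + a, h - 1 - b, -theta),
            ]
            if a != b:  # off-diagonal pixels have four more mirrors across y = x
                points += [
                    (h + b, h + a, math.pi / 2 - theta),
                    (h - 1 - b, h + a, math.pi / 2 + theta),
                    (h - 1 - b, h - 1 - a, 1.5 * math.pi - theta),
                    (h + b, h - 1 - a, 1.5 * math.pi + theta),
                ]
            acc += radial * sum(img[r][c] * cmath.exp(-1j * m * ang)
                                for c, r, ang in points)
    return (n + 1) / math.pi * acc * (2.0 / size) ** 2


if __name__ == "__main__":
    demo = [[(3 * c + 5 * r) % 7 for c in range(8)] for r in range(8)]
    print(zernike_direct(demo, 4, 2), zernike_octant(demo, 4, 2))
```

In this sketch the diagonal test `a != b` is the only branch left inside the octant loop. On a GPU, a table of values shared by all threads (for example the factorial coefficients of the radial polynomial) would be a natural candidate for constant memory, though which values the authors actually precompute is not stated here.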
Journal: Journal of Parallel and Distributed Computing - Volume 111, January 2018, Pages 104-114