Enhancement of membrane computing model implementation on GPU by introducing matrix representation for balancing occupancy and reducing inter-block communications

Article ID	Journal	Published Year	Pages	File Type
429523	Journal of Computational Science	2014	11 Pages	PDF

Abstract

Highlights

• Multiprocessor occupancy is an important factor in GPU performance.
• Communication between thread blocks requires a kernel invocation, which is time-consuming.
• The new matrix representation of membrane systems increases GPU occupancy.
• The new matrix representation reduces communication between thread blocks.
• The new algorithm based on the matrix representation performs better than other algorithms.

In previous studies, the objects of each membrane were assigned to the threads of a single thread block of the graphics processing unit (GPU). Consequently, the number of active threads was low when a membrane contained few objects. This study represents the objects of membranes as elements of a matrix. A sub-matrix then holds the appropriate number of objects to assign to the threads of each thread block, which balances the load and keeps occupancy high even when the number of objects per membrane is low. The size of the sub-matrix, that is, the appropriate number of active threads, is determined automatically. Furthermore, this approach makes it possible to assign more than one membrane to each thread block and to carry out communication between membranes in the same thread block without time-consuming inter-block communication. For example, with two objects per membrane, the previous algorithm achieves a speedup of 0.6×, whereas the proposed algorithm achieves a speedup of 32.4×.
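The mapping idea described in the abstract can be illustrated with some simple arithmetic. The Python sketch below is not the authors' implementation. It assumes a matrix with one row per membrane and one column per object, packs as many whole rows as fit into one thread block, and uses the fraction of launched threads that hold an object as a rough proxy for occupancy (true GPU occupancy is defined in terms of active warps per multiprocessor). The function names, the packing rule, and the block size used in the usage note are all hypothetical. The abstract says only that the sub-matrix size is determined automatically, without stating the rule.

```python
import math


def _check(num_membranes, objects_per_membrane, threads_per_block):
    # Assumption of this sketch: a whole membrane row fits in one thread block.
    if num_membranes < 1 or objects_per_membrane < 1:
        raise ValueError("need at least one membrane and one object per membrane")
    if objects_per_membrane > threads_per_block:
        raise ValueError("sketch assumes a membrane row fits in one thread block")


def one_membrane_per_block(num_membranes, objects_per_membrane, threads_per_block):
    """Earlier mapping: every membrane gets a thread block of its own.

    Returns (number_of_blocks, fraction_of_threads_holding_an_object).
    """
    _check(num_membranes, objects_per_membrane, threads_per_block)
    return num_membranes, objects_per_membrane / threads_per_block


def packed_submatrix(num_membranes, objects_per_membrane, threads_per_block):
    """Matrix mapping: each block receives a sub-matrix of whole membrane rows.

    Returns (number_of_blocks, rows_per_block, fraction_of_threads_holding_an_object).
    The packing rule (as many whole rows as fit) is an assumption of this sketch.
    """
    _check(num_membranes, objects_per_membrane, threads_per_block)
    rows_per_block = min(threads_per_block // objects_per_membrane, num_membranes)
    blocks = math.ceil(num_membranes / rows_per_block)
    useful_threads = num_membranes * objects_per_membrane
    return blocks, rows_per_block, useful_threads / (blocks * threads_per_block)


def same_block(membrane_a, membrane_b, rows_per_block):
    """True if two membranes (row indices) land in the same thread block,
    so their communication would not need to cross block boundaries."""
    return membrane_a // rows_per_block == membrane_b // rows_per_block
```

For instance, with 1024 membranes of two objects each and a hypothetical block size of 256 threads, `one_membrane_per_block(1024, 2, 256)` launches 1024 blocks in which fewer than 1% of the threads hold an object, whereas `packed_submatrix(1024, 2, 256)` launches 8 blocks of 128 membranes each with every thread in use. These figures illustrate thread utilization only and are unrelated to the measured speedups reported above.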

Keywords

P systems; Membrane computing; Graphics Processing Unit; Parallel processing