An approach for analyzing auto-vectorization potential of emerging workloads

Article ID	Journal	Published Year	Pages	File Type
4956782	Microprocessors and Microsystems	2017	11 Pages	PDF

Abstract

This paper presents an analytical study of the PARSEC benchmark suite that examines how well the ICC and GCC compilers can auto-vectorize emerging workloads. To assess auto-vectorization potential, we analyzed the number of vectorized and non-vectorized loops and the number of vector instructions in each application. We found that most of the time-consuming loops in the applications were not vectorized. We then modified the applications and profiled them again. The modifications considerably increased the number of vectorized loops, but the instruction count did not fall as much as expected because of the limited SIMD width of current processors. Consequently, in addition to applying source-level techniques that help compilers auto-vectorize (such as loop unrolling, splitting large loops, suitable definition of data structures, replacing function calls in loops with the function bodies, and removing control flow from loops where possible), increasing the SIMD width of CPU vector extensions is important for improving speed and performance.
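
Two of the loop modifications named in the abstract, replacing a function call with its body and removing control flow from the loop, can be illustrated with a minimal C sketch. The function names and the clamping kernel below are hypothetical and are not taken from PARSEC or from the paper; whether a given compiler vectorizes either version depends on its version and flags, and should be checked with a vectorization report (for example GCC's `-O3 -fopt-info-vec`).

```c
#include <stddef.h>

/* Hypothetical helper with an early-return branch. When such a helper is
   not inlined, the call inside a loop can prevent auto-vectorization. */
float clamp_nonneg(float x)
{
    if (x < 0.0f)
        return 0.0f;
    return x;
}

/* Baseline: function call (and its branch) inside the loop body. */
void scale_clamped_baseline(float *restrict out, const float *restrict in,
                            float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = clamp_nonneg(in[i]) * k;
}

/* Modified: the function body is written directly in the loop and the
   branch is expressed as a select, which compilers can typically map to
   a vector max or blend. The restrict qualifiers state that out and in
   do not overlap. */
void scale_clamped_simd_friendly(float *restrict out, const float *restrict in,
                                 float k, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float x = in[i];
        float y = (x < 0.0f) ? 0.0f : x; /* select instead of branch */
        out[i] = y * k;
    }
}
```

Both functions compute the same result; only the loop shape differs. Even when the second form is vectorized, the number of elements processed per instruction is still bounded by the SIMD width (for example, four floats per 128-bit register), which is the limit the abstract points to.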

Keywords

Profiling