Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
433432 | Science of Computer Programming | 2013 | 14 Pages |
This paper presents a compilation technique that performs the automatic parallelization of canonical loops. Canonical loops are a recurring pattern that we have observed in many well known algorithms, such as frequent itemset, K-means and K nearest neighbors. Our compiler translates C code to sequences of stream filters that communicate through a variety of channel types. We analyze code containing canonical loops, separate the data over a cluster of processors and determine suitable communication strategies between these processors. Experiments performed on a cluster of 36 computers show that, for the three algorithms described above, our method produces speed-ups that are almost linear on the number of available processors. These experiments also show that the code automatically generated is competitive when compared to hand tuned programs.