Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
490484 | Procedia Computer Science | 2013 | 10 Pages |
Abstract
This work analyzes the behavior of the multithreaded implementations of several LAPACK routines in PLASMA and Intel MKL. The main goal is to develop a methodology for installing and modelling shared-memory linear algebra routines so that decisions that reduce execution time can be taken at run time. Typical decisions are the number of threads to use, the block or tile size in block or tile algorithms, and the routine to use when several algorithms or implementations are available for the same problem. Experiments with PLASMA and Intel MKL show that these decisions can be taken automatically and that satisfactory execution times are obtained.
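The kind of run-time decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' methodology: it assumes a simple two-phase scheme in which an installation step times each (threads, tile size) pair for a few sample problem sizes, and a run-time step picks the fastest pair recorded for the nearest installed size. The names `install`, `choose`, and `toy_measure` are hypothetical, and `toy_measure` is an invented cost formula standing in for real timings of a PLASMA or MKL routine.

```python
from itertools import product


def install(measure, sizes, thread_counts, tile_sizes):
    """Installation phase: record a cost for every (threads, tile) pair
    at each sample problem size. `measure` is any callable returning a
    cost for (n, threads, tile); in practice it would time a real routine."""
    table = {}
    for n in sizes:
        table[n] = {
            (threads, tile): measure(n, threads, tile)
            for threads, tile in product(thread_counts, tile_sizes)
        }
    return table


def choose(table, n):
    """Run-time decision: look up the installed size closest to n and
    return the (threads, tile) pair with the lowest recorded cost."""
    nearest = min(table, key=lambda m: abs(m - n))
    costs = table[nearest]
    return min(costs, key=costs.get)


def toy_measure(n, threads, tile):
    """Hypothetical cost model, NOT measured data: cubic work divided
    among threads, a fixed per-thread overhead, and a tile-size penalty
    that is smallest at tile == 64."""
    work = n**3 / threads
    thread_overhead = 1e6 * threads
    tile_penalty = n**2 * (tile / 64 + 64 / tile)
    return work + thread_overhead + tile_penalty


if __name__ == "__main__":
    table = install(toy_measure, sizes=[100, 2000],
                    thread_counts=[1, 2, 4, 8], tile_sizes=[32, 64, 128])
    for n in (150, 1800):
        threads, tile = choose(table, n)
        print(f"n={n}: threads={threads}, tile={tile}")
```

The choice among several routines could be handled in the same way by adding the routine name as a third component of each table key. A nearest-size lookup is the simplest option; interpolating between installed sizes or fitting a parametric model are alternatives this sketch does not attempt.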