Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
490200 | Procedia Computer Science | 2014 | 10 | |
Abstract
The introduction of auto-tuning techniques into linear algebra routines that use hybrid combinations of multicore CPU and GPU computing resources is analyzed. Basic models of the execution time, together with information obtained while the routines are installed on a system, are used to minimize the execution time through a balanced assignment of the work among the computing components. The study is carried out with a basic kernel (matrix-matrix multiplication) and a higher-level routine (LU factorization) using GPUs and the host multicore processor. Satisfactory results are obtained, with experimental execution times close to the lowest experimentally achievable.
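To illustrate the kind of balanced work assignment the abstract describes, the sketch below splits the rows of a matrix-matrix multiplication between a CPU and a GPU in proportion to throughputs measured at installation time. This is a minimal illustration, not the authors' code: the function name, the throughput figures, and the pure-NumPy execution (both pieces run on the host here) are all assumptions for demonstration.

```python
# Minimal sketch of a balanced CPU/GPU work split for C = A @ B.
# Each device gets a share of the rows proportional to its measured
# throughput, so that both finish at (approximately) the same time.
import numpy as np

def balanced_split(n_rows, cpu_gflops, gpu_gflops):
    """Rows assigned to the GPU. Balance means t_gpu == t_cpu, which
    gives the GPU the fraction f = gpu_gflops / (cpu_gflops + gpu_gflops)."""
    f = gpu_gflops / (cpu_gflops + gpu_gflops)
    return int(round(f * n_rows))

# Throughputs as would be obtained from installation-time benchmarks
# (illustrative placeholder values, not measurements from the paper).
cpu_gflops, gpu_gflops = 90.0, 350.0

n = 2048
A, B = np.random.rand(n, n), np.random.rand(n, n)

k = balanced_split(n, cpu_gflops, gpu_gflops)   # rows for the GPU
# In a real hybrid code, A[:k] @ B would run on the GPU (e.g. via a
# GPU BLAS) while A[k:] @ B runs on the multicore BLAS; here both
# pieces run on the host just to show the decomposition is exact.
C = np.vstack([A[:k] @ B, A[k:] @ B])
assert np.allclose(C, A @ B)
```

The same proportional-split idea extends to the higher-level routine mentioned in the abstract (LU factorization), where the partition would be recomputed per panel or block step rather than once for the whole matrix.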
Related Topics
- Physical Sciences and Engineering
- Computer Science
- Computer Science (General)