Article ID: 486677
Journal: Procedia Computer Science
Published Year: 2012
Pages: 10
File Type: PDF
Abstract

In this paper we present a detailed study on tuning double-precision matrix-matrix multiplication (DGEMM) on the Intel Xeon E5-2680 CPU. We selected an optimal algorithm from the instruction set perspective, as well as software tools optimized for Intel Advanced Vector Extensions (AVX). Our optimizations included the use of vector memory operations and AVX instructions. Our proposed algorithm achieves a performance improvement of 33% compared to the latest results achieved using the Intel Math Kernel Library DGEMM subroutine.
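
To make the idea of vector memory operations combined with AVX arithmetic concrete, the following is a minimal sketch of a DGEMM inner kernel that processes four doubles at a time, which is the width of one 256-bit AVX register. It is not the algorithm proposed in the paper. The function names `dgemm_sketch` and `dgemm_ref`, the row-major layout, and the requirement that `n` be a multiple of four are assumptions made for illustration. The kernel is written in portable C so that it compiles on any platform, and the comments note which AVX intrinsic each four-wide loop would typically correspond to.

```c
#include <stddef.h>

/* Number of doubles in one 256-bit AVX register. */
#define VEC 4

/*
 * Illustrative sketch (not the paper's kernel): C += A * B for row-major
 * n x n matrices. Assumes n is a multiple of VEC. Each loop over v stands
 * in for a single AVX operation on four packed doubles.
 */
void dgemm_sketch(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; j += VEC) {
            double acc[VEC];

            /* Vector load of four results (cf. _mm256_loadu_pd). */
            for (size_t v = 0; v < VEC; ++v)
                acc[v] = C[i * n + j + v];

            for (size_t k = 0; k < n; ++k) {
                /* Broadcast one element of A (cf. _mm256_broadcast_sd). */
                const double a = A[i * n + k];
                /* Four contiguous elements of row k of B (vector load). */
                const double *b = &B[k * n + j];

                /* Multiply then add (cf. _mm256_mul_pd, _mm256_add_pd);
                 * first-generation AVX has no fused multiply-add. */
                for (size_t v = 0; v < VEC; ++v)
                    acc[v] += a * b[v];
            }

            /* Vector store of four results (cf. _mm256_storeu_pd). */
            for (size_t v = 0; v < VEC; ++v)
                C[i * n + j + v] = acc[v];
        }
    }
}

/* Scalar triple-loop reference used to check the sketch. */
void dgemm_ref(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) {
            double sum = C[i * n + j];
            for (size_t k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}
```

A tuned implementation would additionally block for the cache hierarchy and keep several accumulator registers live per iteration. This sketch shows only the four-wide load, multiply, add, and store pattern.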
