Article code: 4964478
Journal code: 1447807
Publication year: 2017
English article: 33-page PDF
Full-text version: Free download
English title of the ISI article
A high performance data parallel tensor contraction framework: Application to coupled electro-mechanics
Persian translation of the title
یک داده با کارایی بالا چارچوب انقباض تانسور موازی: کاربرد الکترومکانیکی متصل
Keywords
Tensor contraction; data parallelism; domain-aware expression templates; nonlinear electro-mechanics
Related topics
Engineering and basic sciences; Chemistry; Theoretical and applied chemistry
English abstract
The paper presents aspects of implementation of a new high performance tensor contraction framework for the numerical analysis of coupled and multi-physics problems on streaming architectures. In addition to explicit SIMD instructions and smart expression templates, the framework introduces domain specific constructs for the tensor cross product and its associated algebra recently rediscovered by Bonet et al. (2015, 2016) in the context of solid mechanics. The two key ingredients of the presented expression template engine are as follows. First, the capability to mathematically transform complex chains of operations to simpler equivalent expressions, while potentially avoiding routes with higher levels of computational complexity and, second, to perform a compile time depth-first or breadth-first search to find the optimal contraction indices of a large tensor network in order to minimise the number of floating point operations. For optimisations of tensor contraction such as loop transformation, loop fusion and data locality optimisations, the framework relies heavily on compile time technologies rather than source-to-source translation or JIT techniques. Every aspect of the framework is examined through relevant performance benchmarks, including the impact of data parallelism on the performance of isomorphic and nonisomorphic tensor products, the FLOP and memory I/O optimality in the evaluation of tensor networks, the compilation cost and memory footprint of the framework and the performance of tensor cross product kernels. The framework is then applied to finite element analysis of coupled electro-mechanical problems to assess the speed-ups achieved in kernel-based numerical integration of complex electroelastic energy functionals. In this context, domain-aware expression templates combined with SIMD instructions are shown to provide a significant speed-up over the classical low-level style programming techniques.
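The abstract's point about searching for contraction indices that minimise floating point operations can be illustrated with a small runtime sketch. This is not the framework's compile-time engine: the function name `best_contraction_order`, the string encoding of tensors by their index letters, and the cost model (one multiply-add per combination of the indices involved in a pairwise contraction) are all assumptions made for illustration.

```python
from itertools import combinations
from math import prod


def best_contraction_order(tensors, dims, output):
    """Depth-first search over pairwise contraction orders (illustrative only).

    tensors: list of index strings, e.g. ["ij", "jk", "k"] for A_ij B_jk v_k
    dims:    dict mapping each index letter to its extent
    output:  index string of the final result, e.g. "i"
    Returns (estimated multiply-add count, list of contracted pairs).
    """
    best = {"cost": float("inf"), "path": None}

    def dfs(current, cost, path):
        # Prune any branch that is already no better than the best found.
        if cost >= best["cost"]:
            return
        if len(current) == 1:
            best["cost"], best["path"] = cost, path
            return
        for i, j in combinations(range(len(current)), 2):
            a, b = current[i], current[j]
            rest = [t for k, t in enumerate(current) if k not in (i, j)]
            involved = set(a) | set(b)
            # An index survives if the output or a remaining tensor needs it.
            keep = set(output).union(*map(set, rest))
            result = "".join(sorted(involved & keep))
            # Assumed cost model: one multiply-add per index combination.
            step = prod(dims[x] for x in involved)
            dfs(rest + [result], cost + step, path + [(a, b)])

    dfs(list(tensors), 0, [])
    return best["cost"], best["path"]


# A_ij B_jk v_k with all extents equal to 100.
cost, path = best_contraction_order(
    ["ij", "jk", "k"], {"i": 100, "j": 100, "k": 100}, "i"
)
print(cost, path)  # 20000 [('jk', 'k'), ('ij', 'j')]
```

Under this cost model, contracting B with v first costs 20,000 multiply-adds in total, whereas forming the product A B first costs 1,010,000, which is the kind of gap an automated search over contraction orders is meant to avoid.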
Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Physics Communications - Volume 216, July 2017, Pages 35-52