A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units

Journal article
(Original article)


Publication details

Authors: Kreutzer M, Hager G, Wellein G, Fehske H, Bishop AR
Journal: SIAM Journal on Scientific Computing
Publisher: Society for Industrial and Applied Mathematics
Year of publication: 2014
Volume: 36
Issue: 5
Pages: C401–C423
ISSN: 1064-8275


Abstract


Sparse matrix-vector multiplication (spMVM) is the most time-consuming kernel in many numerical algorithms and has been studied extensively on all modern processor and accelerator architectures. However, the optimal sparse matrix data storage format is highly hardware-specific, which could become an obstacle when using heterogeneous systems. Also, it is as yet unclear how the wide single instruction multiple data (SIMD) units in current multi- and many-core processors should be used most efficiently if there is no structure in the sparsity pattern of the matrix. We suggest SELL-C-σ, a variant of Sliced ELLPACK, as a SIMD-friendly data format which combines long-standing ideas from general-purpose graphics processing units and vector computer programming. We discuss the advantages of SELL-C-σ compared to established formats like Compressed Row Storage and ELLPACK and show its suitability on a variety of hardware platforms (Intel Sandy Bridge, Intel Xeon Phi, and Nvidia Tesla K20) for a wide range of test matrices from different application areas. Using appropriate performance models we develop deep insight into the data transfer properties of the SELL-C-σ spMVM kernel. SELL-C-σ comes with two tuning parameters whose performance impact across the range of test matrices is studied and for which reasonable choices are proposed. This leads to a hardware-independent ("catch-all") sparse matrix format, which achieves very high efficiency for all test matrices across all hardware platforms.
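
The abstract names the format and its two tuning parameters but does not define them. The following is a minimal sketch, assuming the usual reading of SELL-C-σ: C is the chunk height (rows per chunk, padded to the longest row in that chunk and stored column-major) and σ is the sorting scope (rows are sorted by length only within windows of σ rows). The function names `build_sell_c_sigma` and `spmv` and the dictionary layout are illustrative assumptions, not the authors' implementation.

```python
def build_sell_c_sigma(rows, C, sigma):
    """Build an illustrative SELL-C-sigma structure.

    rows  : list of rows, each a list of (column, value) pairs
    C     : chunk height (assumed meaning of the parameter C)
    sigma : sorting scope (assumed meaning of the parameter sigma)
    """
    n_rows = len(rows)
    # Sort rows by descending length, but only within windows of sigma rows.
    perm = []
    for start in range(0, n_rows, sigma):
        window = list(range(start, min(start + sigma, n_rows)))
        window.sort(key=lambda r: len(rows[r]), reverse=True)
        perm.extend(window)

    vals, cols, chunk_ptr, chunk_len = [], [], [0], []
    for start in range(0, n_rows, C):
        chunk_rows = perm[start:start + C]
        width = max(len(rows[r]) for r in chunk_rows)
        chunk_len.append(width)
        # Column-major inside the chunk: entry j of all C rows is contiguous,
        # which is what makes the layout SIMD-friendly.
        for j in range(width):
            for i in range(C):
                if i < len(chunk_rows) and j < len(rows[chunk_rows[i]]):
                    c, v = rows[chunk_rows[i]][j]
                else:
                    c, v = 0, 0.0  # zero padding
                cols.append(c)
                vals.append(v)
        chunk_ptr.append(len(vals))
    return {"C": C, "n_rows": n_rows, "perm": perm, "vals": vals,
            "cols": cols, "chunk_ptr": chunk_ptr, "chunk_len": chunk_len}


def spmv(m, x):
    """Compute y = A @ x for a structure returned by build_sell_c_sigma."""
    y = [0.0] * m["n_rows"]
    C = m["C"]
    for k, width in enumerate(m["chunk_len"]):
        base = m["chunk_ptr"][k]
        for i in range(C):
            pos = k * C + i
            if pos >= m["n_rows"]:
                break  # last chunk may hold fewer than C real rows
            acc = 0.0
            for j in range(width):
                idx = base + j * C + i
                acc += m["vals"][idx] * x[m["cols"][idx]]
            y[m["perm"][pos]] = acc  # undo the row permutation
    return y


# A small 4x4 example matrix with row lengths 1, 3, 1, 2.
rows = [
    [(0, 1.0)],
    [(0, 2.0), (1, 3.0), (3, 4.0)],
    [(2, 5.0)],
    [(1, 6.0), (3, 7.0)],
]
x = [1.0, 2.0, 3.0, 4.0]
unsorted_m = build_sell_c_sigma(rows, C=2, sigma=1)  # no sorting
sorted_m = build_sell_c_sigma(rows, C=2, sigma=4)    # sort across all rows
```

In this toy case both variants give the same product, but sorting with a larger σ groups rows of similar length into the same chunk, so fewer padding zeros are stored.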



FAU authors / FAU editors

Hager, Georg Dr.
Regionales Rechenzentrum Erlangen (RRZE)
Kreutzer, Moritz
Regionales Rechenzentrum Erlangen (RRZE)
Wellein, Gerhard Prof. Dr.
Professur für Höchstleistungsrechnen


Institutions of additional authors

Los Alamos National Laboratory
Universität Greifswald


Research areas

Hardware-efficient building blocks for sparse linear algebra and stencil-based methods
Professur für Höchstleistungsrechnen


How to cite

APA:
Kreutzer, M., Hager, G., Wellein, G., Fehske, H., & Bishop, A.R. (2014). A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units. SIAM Journal on Scientific Computing, 36(5), C401–C423. https://dx.doi.org/10.1137/130930352

MLA:
Kreutzer, Moritz, et al. "A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units." SIAM Journal on Scientific Computing 36.5 (2014): C401–C423.

Last updated 2018-09-08 at 17:08