A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units
Author(s): Kreutzer M, Hager G, Wellein G, Fehske H, Bishop A
Publisher: Society for Industrial and Applied Mathematics
Publication year: 2014
Journal issue: 5
Pages range: C401C423
Sparse matrix-vector multiplication (spMVM) is the most time-consuming kernel in many numerical algorithms and has been studied extensively on all modern processor and accelerator architectures. However, the optimal sparse matrix data storage format is highly hardware-specific, which could become an obstacle when using heterogeneous systems. Also, it is as yet unclear how the wide single instruction multiple data (SIMD) units in current multi- and many-core processors should be used most efficiently if there is no structure in the sparsity pattern of the matrix.We suggest SELLC- s, a variant of Sliced ELLPACK, as a SIMD-friendly data format which combines long-standing ideas from general-purpose graphics processing units and vector computer programming. We discuss the advantages of SELL-C-s compared to established formats like Compressed Row Storage and ELLPACK and show its suitability on a variety of hardware platforms (Intel Sandy Bridge, Intel Xeon Phi, and Nvidia Tesla K20) for a wide range of test matrices from different application areas. Using appropriate performance models we develop deep insight into the data transfer properties of the SELL-C-s spMVM kernel. SELL-C-s comes with two tuning parameters whose performance impact across the range of test matrices is studied and for which reasonable choices are proposed. This leads to a hardware-independent ("catch-all") sparse matrix format, which achieves very high efficiency for all test matrices across all hardware platforms.
FAU Authors / FAU Editors Focus Area of Individual Faculties How to cite
APA: Kreutzer, M., Hager, G., Wellein, G., Fehske, H., & Bishop, A. (2014). A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units. SIAM Journal on Scientific Computing, 36(5), C401C423. https://dx.doi.org/10.1137/130930352
MLA: Kreutzer, Moritz, et al. "A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units." SIAM Journal on Scientific Computing 36.5 (2014): C401C423.