Equipping Sparse Solvers for Exascale II (ESSEX-II)

Third Party Funds Group - Sub project

Overall project details

Overall project: SPP 1648: Software for Exascale Computing


Project Details

Project leader:
Prof. Dr. Gerhard Wellein

Project members:
Faisal Shahzad

Contributing FAU Organisations:
Professur für Höchstleistungsrechnen
Regionales Rechenzentrum Erlangen (RRZE)

Funding source: DFG / Schwerpunktprogramm (SPP)
Acronym: SPPEXA
Start date: 01/01/2016
End date: 31/12/2018


Research Fields

Hardware-efficient building blocks for sparse linear algebra and stencil-based methods
Professur für Höchstleistungsrechnen


Abstract (technical / expert description):


The ESSEX-II project will use the successful concepts and software blueprints developed in ESSEX-I for sparse eigenvalue solvers to produce widely usable and scalable software solutions with high hardware efficiency for the computer architectures of the upcoming decade. All activities are organized along the traditional software layers of low-level parallel building blocks (kernels), algorithm implementations, and applications. However, the classic abstraction boundaries separating these layers are broken in ESSEX-II by strongly integrating objectives: scalability, numerical reliability, fault tolerance, and holistic performance and power engineering.

Driven by Moore's Law and power dissipation constraints, computer systems will become more parallel and heterogeneous even on the node level in upcoming years, further increasing overall system parallelism. MPI+X programming models can be adapted in flexible ways to the underlying hardware structure and are widely expected to be able to address the challenges of the massively multi-level parallel heterogeneous architectures of the next decade. Consequently, the parallel building blocks layer supports MPI+X, with X being a combination of node-level programming models able to fully exploit hardware heterogeneity, functional parallelism, and data parallelism. In addition, facilities for fully asynchronous checkpointing, silent data corruption detection and correction, performance assessment, performance model validation, and energy measurements will be provided.

The algorithms layer will leverage the components in the building blocks layer to deliver fully heterogeneous, automatically fault-tolerant, and state-of-the-art implementations of Jacobi-Davidson eigensolvers, the Kernel Polynomial Method (KPM), and Chebyshev Time Propagation (ChebTP) that are ready to use for production on modern heterogeneous compute nodes with best performance and numerical accuracy. Chebyshev filter diagonalization (ChebFD) and a Krylov eigensolver complement these implementations, and the recent FEAST method will be investigated and further developed for improved scalability. The applications layer will deliver scalable solutions for conservative (Hermitian) and dissipative (non-Hermitian) quantum systems with strong links to optics and biology and to novel materials such as graphene and topological insulators.

Extending its predecessor project, ESSEX-II adopts an additional focus on production-grade software. Although the selection of algorithms is strictly motivated by quantum physics application scenarios, the underlying research directions of algorithmic and hardware efficiency, accuracy, and resilience will radiate into many fields of computational science. Most importantly, all developments will be accompanied by an uncompromising performance engineering process that will rigorously expose any discrepancy between expected and observed resource efficiency.



External Partners

Universität Greifswald
German Aerospace Center / Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR)
Bergische Universität Wuppertal
University of Tokyo
University of Tsukuba / 筑波大学


Publications

Kreutzer, M., Ernst, D., Bishop, A.R., Fehske, H., Hager, G., Nakajima, K., & Wellein, G. (2018). Chebyshev filter diagonalization on modern manycore processors and GPGPUs. Springer Verlag.
Shahzad, F., Thies, J., Kreutzer, M., Zeiser, T., Hager, G., & Wellein, G. (2018). CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance. IEEE Transactions on Parallel and Distributed Systems. https://dx.doi.org/10.1109/TPDS.2018.2866794
Anzt, H., Kreutzer, M., Ponce, E., Peterson, G.D., Wellein, G., & Dongarra, J. (2018). Optimization and performance evaluation of the IDR iterative Krylov solver on GPUs. International Journal of High Performance Computing Applications, 32(2), 220-230. https://dx.doi.org/10.1177/1094342016646844
Galgon, M., Krämer, L., Lang, B., Alvermann, A., Fehske, H., Pieper, A.,... Thies, J. (2017). Improved coefficients for polynomial filtering in ESSEX. In Proceedings of the 1st International Workshop on Eigenvalue Problems: Algorithms, Software and Applications in Petascale Computing, EPASA 2015 (pp. 63-79). Springer Verlag.
Anzt, H., Gates, M., Dongarra, J., Kreutzer, M., Wellein, G., & Köhler, M. (2017). Preconditioned Krylov solvers on GPUs. Parallel Computing, 68, 32-44. https://dx.doi.org/10.1016/j.parco.2017.05.006
Anzt, H., Dongarra, J., Kreutzer, M., Wellein, G., & Köhler, M. (2016). Efficiency of general Krylov methods on GPUs - An experimental study. In Proceedings of the 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016 (pp. 683-691). IEEE Computer Society.
Kreutzer, M., Thies, J., Röhrig-Zöllner, M., Pieper, A., Shahzad, F., Galgon, M.,... Wellein, G. (2016). GHOST: Building Blocks for High Performance Sparse Linear Algebra on Heterogeneous Systems. International Journal of Parallel Programming, 1-27. https://dx.doi.org/10.1007/s10766-016-0464-z
Wellein, G., Alvermann, A., Fehske, H., Hager, G., Kreutzer, M., Lang, B.,... Galgon, M. (2016). High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations. Journal of Computational Physics, 325, 226-243. https://dx.doi.org/10.1016/j.jcp.2016.08.027

Last updated on 2018-11-22 at 18:01