Equipping Sparse Solvers for Exascale II (ESSEX-II) (SPPEXA)

Third Party Funds Group - Sub project


Acronym: SPPEXA

Start date: 01.01.2016

End date: 31.12.2018

Website: https://blogs.fau.de/essex/activities


Overall project details

Overall project

SPP 1648: Software for Exascale Computing

Project details

Scientific Abstract

The ESSEX-II project will use the successful concepts and software
blueprints developed in ESSEX-I for sparse eigenvalue solvers to
produce widely usable and scalable software solutions with high
hardware efficiency for the computer architectures of the upcoming
decade. All activities are organized along the traditional software
layers of low-level parallel building blocks (kernels), algorithm
implementations, and applications. However, the classic abstraction
boundaries separating these layers are broken in ESSEX-II by
strongly integrating objectives: scalability, numerical reliability, fault
tolerance, and holistic performance and power engineering. Driven by
Moore's Law and power dissipation constraints, computer systems will
become more parallel and heterogeneous even on the node level in
upcoming years, further increasing overall system parallelism. MPI+X
programming models can be adapted in flexible ways to the
underlying hardware structure and are widely expected to be able to
address the challenges of the massively multi-level parallel
heterogeneous architectures of the next decade. Consequently, the
parallel building blocks layer supports MPI+X, with X being a
combination of node-level programming models able to fully exploit
hardware heterogeneity, functional parallelism, and data parallelism.
In addition, facilities for fully asynchronous checkpointing, silent data
corruption detection and correction, performance assessment,
performance model validation, and energy measurements will be
provided. The algorithms layer will leverage the components in the
building blocks layer to deliver fully heterogeneous, automatically
fault-tolerant, and state-of-the-art implementations of Jacobi-Davidson
eigensolvers, the Kernel Polynomial Method (KPM), and Chebyshev
Time Propagation (ChebTP) that are ready to use for production on
modern heterogeneous compute nodes with best performance and
numerical accuracy. Chebyshev filter diagonalization (ChebFD) and a
Krylov eigensolver complement these implementations, and the
recent FEAST method will be investigated and further developed for
improved scalability. The applications layer will deliver scalable
solutions for conservative (Hermitian) and dissipative (non-Hermitian)
quantum systems with strong links to optics and biology and to novel
materials such as graphene and topological insulators. Extending its
predecessor project, ESSEX-II adopts an additional focus on
production-grade software. Although the selection of algorithms is
strictly motivated by quantum physics application scenarios, the
underlying research directions of algorithmic and hardware efficiency,
accuracy, and resilience will radiate into many fields of computational
science. Most importantly, all developments will be accompanied by
an uncompromising performance engineering process that will
rigorously expose any discrepancy between expected and observed
resource efficiency.
