We report on a two-scale approach for efficient matrix-free finite element simulations. The proposed method is based on surrogate element matrices constructed by low-order polynomial approximations. It is applied to a Stokes-type PDE system with variable viscosity, which is a key component of mantle convection models. We set the ground for a rigorous performance analysis inspired by the concept of parallel textbook multigrid efficiency and study the weak scaling behavior on SuperMUC, a peta-scale supercomputer system. For a complex geodynamical model, we achieve a parallel efficiency of 95% on up to 47 250 compute cores. Our largest simulation uses a trillion ($10^{12}$) degrees of freedom for a global mesh resolution of 1.7 km.

The Stokes system with constant viscosity can be cast into different formulations by exploiting the incompressibility constraint. For instance, the rate-of-strain tensor in the weak formulation can be replaced by the velocity gradient, yielding a decoupling of the velocity components in the different coordinate directions. Consequently, the discretization of this partly decoupled formulation leads to fewer nonzero entries in the stiffness matrix. This is of particular interest in large-scale simulations, where a reduced memory bandwidth requirement can help to significantly accelerate the computations. In the case of a piecewise constant viscosity, as it typically arises in multi-phase flows, or when the boundary conditions involve traction, the situation is more complex, and one has to treat the cross derivatives in the original Stokes system with care. A naive application of the standard vectorial Laplacian results in a …
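The decoupling rests on a standard identity: for constant viscosity $\mu$ and a divergence-free velocity $u$, the divergence of the strain-rate term reduces to the component-wise Laplacian. A textbook sketch of this computation:

```latex
\nabla \cdot \bigl(2\mu\,\varepsilon(u)\bigr)
  = \mu\,\Delta u + \mu\,\nabla(\nabla \cdot u)
  = \mu\,\Delta u,
\qquad
\varepsilon(u) = \tfrac{1}{2}\bigl(\nabla u + (\nabla u)^{\top}\bigr),
\quad \nabla \cdot u = 0 .
```

For variable viscosity, $\mu$ cannot be pulled out of the divergence, so cross-derivative couplings involving $\nabla\mu$ remain and the two formulations are no longer interchangeable.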

In this paper, we report on a two-scale approach for efficient matrix-free finite element simulations. It is an extended version of our previous conference publication [1]. The proposed method is based on surrogate element matrices constructed by low-order polynomial approximations. It is applied to a Stokes-type PDE system with variable viscosity, which is a key component of mantle convection models. We set the ground for a rigorous performance analysis inspired by the concept of parallel textbook multigrid efficiency and study the weak scaling behavior on SuperMUC, a peta-scale supercomputer system. For a complex geodynamical model, we achieve, on up to 47 250 compute cores, a parallel efficiency of 93% for application of the discrete operator and 83% for a complete Uzawa V-cycle including the coarse grid solve. Our largest simulation uses a trillion ($\mathcal{O}(10^{12})$) degrees of freedom for a global mesh resolution …
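The surrogate idea can be illustrated in a deliberately simplified one-dimensional setting: instead of assembling the exact, viscosity-dependent stencil entry at every fine-grid node, one samples it at a few points per macro-element and replaces it by a cheap low-degree polynomial. All names and the sample function below are illustrative assumptions; this sketches the general two-scale principle, not the exact construction of the paper.

```python
import numpy as np

def exact_entry(x):
    """Hypothetical 'true' stencil entry: expensive to assemble,
    but varying smoothly with position through a variable viscosity."""
    return 2.0 + 0.5 * np.sin(3.0 * x)

# Sample the exact entries at a few points inside one macro-element ...
samples = np.linspace(0.0, 1.0, 5)
values = exact_entry(samples)

# ... and fit a quadratic surrogate polynomial to the samples.
coeffs = np.polyfit(samples, values, deg=2)
surrogate = np.poly1d(coeffs)

# The surrogate is cheap to evaluate at every fine-grid node and
# approximates the exact entry up to the polynomial truncation error.
fine = np.linspace(0.0, 1.0, 1000)
max_err = np.max(np.abs(surrogate(fine) - exact_entry(fine)))
print(max_err)
```

Because the surrogate is evaluated on the fly, no per-node matrix entries need to be stored, which is what makes the approach attractive in a matrix-free, bandwidth-limited setting.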

With the increasing number of compute components, failures in future exa-scale computer systems are expected to become more frequent. This motivates the study of novel resilience techniques. Here, we extend a recently proposed algorithm-based recovery method for multigrid iterations by introducing an adaptive control. After a fault, the healthy part of the system continues the iterative solution process, while the solution in the faulty domain is re-constructed by an asynchronous on-line recovery. The computations in the faulty and healthy subdomains must be coordinated carefully; in particular, both under- and over-solving must be avoided, since either wastes computational resources and therefore increases the overall time-to-solution. To control the local recovery and guarantee an optimal re-coupling, we introduce a stopping criterion based on a mathematical error estimator. It involves hierarchically weighted sums of residuals in the context of uniformly refined meshes and is well-suited to parallel high-performance computing. The re-coupling process is steered by local contributions of the error estimator. We propose and compare two criteria that differ in their weights. Failure scenarios when solving systems with up to $6.9 \cdot 10^{11}$ unknowns on more than 245 766 parallel processes are reported on a state-of-the-art peta-scale supercomputer, demonstrating the robustness of the method.

**Keywords:** *error estimator, high-performance computing, algorithm-based fault tolerance, multigrid*
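The re-coupling logic described above can be sketched abstractly: the healthy subdomains keep iterating, while the faulty subdomain runs extra local recovery sweeps until its error-estimator contribution drops to the level of the healthy part. The reduction factor, weight, and estimator values below are placeholder assumptions, not the hierarchical estimator of the paper; the sketch only shows how a threshold-based criterion avoids both under- and over-solving.

```python
def recover_faulty_subdomain(healthy_est, faulty_est,
                             reduction=0.3, weight=1.0, max_sweeps=50):
    """Run local recovery sweeps on the faulty subdomain until its
    error-estimator contribution matches the healthy subdomains.

    healthy_est : estimator contributions of the healthy subdomains
    faulty_est  : initial contribution of the faulty subdomain
    reduction   : assumed per-sweep reduction factor of the local smoother
    weight      : safety factor in the stopping criterion
    Returns (number of sweeps used, final faulty contribution).
    """
    target = weight * max(healthy_est)   # re-coupling threshold
    sweeps = 0
    # Stop as soon as the threshold is reached: continuing further would
    # be over-solving; stopping earlier would be under-solving.
    while faulty_est > target and sweeps < max_sweeps:
        faulty_est *= reduction          # asynchronous local smoothing sweep
        sweeps += 1
    return sweeps, faulty_est

# After a fault, the faulty subdomain's contribution is several orders
# of magnitude larger than that of the healthy subdomains:
sweeps, final = recover_faulty_subdomain([1e-6, 2e-6, 1.5e-6], 1e-2)
print(sweeps, final)
```

Choosing the weight is exactly the design question the two proposed criteria address: a larger weight re-couples earlier at the price of a less accurate local solution.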

Heating inside the Earth's core and mantle causes convection currents in the solid mantle, which result in a viscous flow on geological time scales of millions of years. This mantle convection is the driving mechanism of plate tectonics, which causes mountain building, earthquakes, and volcanism. However, many details of the physical processes in Earth mantle convection are poorly known, such as appropriate rheological parameters or the mantle viscosity structure. Due to the enormous spatial and time scales and the inaccessibility of the Earth's interior to direct measurements, studying these processes requires a combination of sophisticated computer simulations and mostly indirect observations.

To allow for the use of realistic physical parameters, Earth mantle convection simulations require extremely large grids for a sufficient resolution of the mantle volume of 10^{12} km^3, as well as many time steps. These simulations are only possible with highly efficient codes that exhibit excellent parallel scalability on modern supercomputers.

In this talk, we present a framework for such large-scale time-dependent mantle convection simulations on a thick spherical shell with variable viscosity. In the simulations, a nonlinear coupled multiphysics problem of the Stokes equations coupled to the energy equation is solved, modeling the conservation of momentum, mass, and energy. These equations are discretized with finite elements and the solution is computed in the Hierarchical Hybrid Grids (HHG) framework. HHG combines the flexibility of unstructured tetrahedral meshes with the efficiency of structured grids for finite element discretizations.

The design of this framework is motivated by the challenging goal of achieving high performance in large-scale parallel finite element simulations on supercomputers. HHG exploits the performance and efficiency of nested structured grid hierarchies and hierarchically organized data structures, combined with the flexibility of unstructured grids. To this end, HHG combines grid partitioning and regular refinement in such a way that an execution paradigm based on stencils can be realized. Within uniform blocks of the mesh, three-dimensional stencils are applied in the fashion of a finite difference method.
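The stencil execution paradigm can be illustrated with a minimal example: inside a uniformly refined block, a constant-coefficient operator reduces to a single stencil (here, for brevity, a 7-point Laplacian rather than a 27-point finite element stencil) applied at every interior node, with no per-node matrix storage. This is an illustrative sketch, not HHG code.

```python
import numpy as np

def apply_7pt_stencil(u, h):
    """Apply the constant 7-point Laplacian stencil to the interior of a
    uniform 3D block, finite-difference style (boundary values kept at 0)."""
    v = np.zeros_like(u)
    c = 1.0 / (h * h)
    v[1:-1, 1:-1, 1:-1] = c * (
        6.0 * u[1:-1, 1:-1, 1:-1]
        - u[:-2, 1:-1, 1:-1] - u[2:, 1:-1, 1:-1]
        - u[1:-1, :-2, 1:-1] - u[1:-1, 2:, 1:-1]
        - u[1:-1, 1:-1, :-2] - u[1:-1, 1:-1, 2:]
    )
    return v

# Sanity check: a linear function lies in the kernel of the discrete
# Laplacian, so the interior result is zero up to rounding.
n, h = 9, 1.0 / 8
x = np.linspace(0.0, 1.0, n)
u = x[:, None, None] + 2.0 * x[None, :, None] + 3.0 * x[None, None, :]
r = apply_7pt_stencil(u, h)
print(np.max(np.abs(r)))
```

Because the same few stencil weights are reused at every node of the block, the kernel is limited by memory bandwidth for the vectors only, which is the efficiency argument behind the structured-block design.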

We present transient simulation results of the temperature distribution for the coupled flow and transport problem, as well as the stationary flow field for variable temperature-dependent viscosity with high viscosity contrasts. Moreover, scaling results are presented to show that our approach facilitates solving systems in excess of ten trillion ($10^{13}$) unknowns on peta-scale systems using compute times of a few minutes},
author = {Bartuschat, Dominik and Rüde, Ulrich and Thoennes, Dominik and Kohl, Nils and Drzisga, Daniel Peter and Huber, Markus and John, Lorenz and Waluga, Christian and Wohlmuth, B. I. and Bauer, Simon and Mohr, Marcus and Bunge, Hans-Peter},
booktitle = {CSEConf2017 -- 2017 International Conference on Computational Science and Engineering - Software, Education, and Biomedical applications},
date = {2017-10-23/2017-10-25},
faupublication = {yes},
peerreviewed = {unknown},
title = {{A} parallel finite element multigrid framework for geodynamic simulations with more than ten trillion unknowns},
url = {https://www10.cs.fau.de/publications/talks/2017/Bartuschat_Oslo_CSEconf17_2017-10-23.pdf},
venue = {Oslo, Norway},
year = {2017}
}