Scalable Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water Model on Unstructured Meshes

Faj J, Kenter T, Faghih-Naini S, Plessl C, Aizinger V (2023)


Publication Type: Conference contribution

Publication year: 2023

Publisher: Association for Computing Machinery

City/Town: New York

Book Volume: 8

Pages Range: 1–12

Event location: Davos CH

ISBN: 979-8-4007-0190-0

DOI: 10.1145/3592979.3593407

Abstract

FPGAs are attracting interest as energy-efficient accelerators for scientific simulations, including methods that operate on unstructured meshes. Considering the potential impact on high-performance computing, specific attention needs to be given to the scalability of such approaches. In this context, the networking capabilities of FPGA hardware and software stacks can play a crucial role in enabling solutions that go beyond a traditional host-MPI and accelerator-offload model.

In this work, we present the multi-FPGA scaling of a discontinuous Galerkin shallow water model using direct low-latency streaming communication between the FPGAs. To this end, the unstructured mesh defining the spatial domain of the simulation is partitioned, the inter-FPGA network is configured to match the topology of neighboring partitions, and halo communication is overlapped with the dataflow computation pipeline. With this approach, we demonstrate strong scaling on up to eight FPGAs with a parallel efficiency of >80% and execution times per time step as low as 7.6 μs. At the same time, with weak scaling, the approach makes it possible to simulate larger meshes that would exceed the local memory limits of a single FPGA, now supporting meshes of more than 100,000 elements and reaching an aggregate performance of up to 6.5 TFLOP/s. Finally, a hierarchical partitioning approach allows for better utilization of the FPGA compute resources in some designs and, by mitigating limitations posed by the communication topology, enables simulations with up to 32 partitions on 8 FPGAs.

How to cite

APA:

Faj, J., Kenter, T., Faghih-Naini, S., Plessl, C., & Aizinger, V. (2023). Scalable Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water Model on Unstructured Meshes. In Proceedings of the Platform for Advanced Scientific Computing Conference (pp. 1–12). Davos, CH. New York: Association for Computing Machinery.

MLA:

Faj, Jennifer, et al. "Scalable Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water Model on Unstructured Meshes." Proceedings of the Platform for Advanced Scientific Computing Conference, Davos. New York: Association for Computing Machinery, 2023. 1–12.
