OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes

Kenter T, Mahale G, Alhaddad S, Grynko Y, Schmitt C, Afzal A, Hannig F, Förstner J, Plessl C (2018)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2018

Publisher: ACM

Page Range: 189-196

Conference Proceedings Title: Proceedings of the 26th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM)

Event location: Boulder, CO, USA

DOI: 10.1109/FCCM.2018.00037

Abstract

The exploration of FPGAs as accelerators for scientific simulations has so far mostly been focused on small kernels of methods working on regular data structures, for example in the form of stencil computations for finite difference methods. In computational sciences, often more advanced methods are employed that promise better stability, convergence, locality and scaling. Unstructured meshes are shown to be more effective and more accurate, compared to regular grids, in representing computation domains of various shapes. Using unstructured meshes, the discontinuous Galerkin method preserves the ability to perform explicit local update operations for simulations in the time domain. In this work, we investigate FPGAs as a target platform for an implementation of the nodal discontinuous Galerkin method to find time-domain solutions of Maxwell's equations in an unstructured mesh. When maximizing data reuse and fitting constant coefficients into suitably partitioned on-chip memory, high computational intensity allows us to implement and feed wide data paths with hundreds of floating-point operators. By decoupling off-chip memory accesses from the computations, high memory bandwidth can be sustained, even for the irregular access pattern required by parts of the application. Using the Intel/Altera OpenCL SDK for FPGAs, we present different implementation variants for different polynomial orders of the method. In different phases of the algorithm, either computational or bandwidth limits of the Arria 10 platform are almost reached, thus outperforming a highly multithreaded CPU implementation by around 2x.


How to cite

APA:

Kenter, T., Mahale, G., Alhaddad, S., Grynko, Y., Schmitt, C., Afzal, A., ... Plessl, C. (2018). OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes. In Proceedings of the 26th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM) (pp. 189-196). Boulder, CO, USA: ACM.

MLA:

Kenter, Tobias, et al. "OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes." Proceedings of the 26th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), Boulder, CO, USA, ACM, 2018. 189-196.
