OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes

Conference contribution
(Conference paper)


Publication details

Author(s): Kenter T, Mahale G, Alhaddad S, Grynko Y, Schmitt C, Afzal A, Hannig F, Förstner J, Plessl C
Publisher: ACM
Year of publication: 2018
Proceedings: Proceedings of the 26th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM)
Language: English


Abstract

The exploration of FPGAs as accelerators for scientific simulations has so far mostly been focused on small kernels of methods working on regular data structures, for example in the form of stencil computations for finite difference methods. In computational sciences, often more advanced methods are employed that promise better stability, convergence, locality and scaling. Unstructured meshes are shown to be more effective and more accurate, compared to regular grids, in representing computation domains of various shapes. Using unstructured meshes, the discontinuous Galerkin method preserves the ability to perform explicit local update operations for simulations in the time domain. In this work, we investigate FPGAs as target platform for an implementation of the nodal discontinuous Galerkin method to find time-domain solutions of Maxwell's equations in an unstructured mesh. When maximizing data reuse and fitting constant coefficients into suitably partitioned on-chip memory, high computational intensity allows us to implement and feed wide data paths with hundreds of floating point operators. By decoupling off-chip memory accesses from the computations, high memory bandwidth can be sustained, even for the irregular access pattern required by parts of the application. Using the Intel/Altera OpenCL SDK for FPGAs, we present different implementation variants for different polynomial orders of the method. In different phases of the algorithm, either computational or bandwidth limits of the Arria 10 platform are almost reached, thus outperforming a highly multithreaded CPU implementation by around 2x.



FAU authors / FAU editors

Afzal, Ayesha
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Hannig, Frank PD Dr.-Ing.
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Schmitt, Christian
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)


How to cite

APA:
Kenter, T., Mahale, G., Alhaddad, S., Grynko, Y., Schmitt, C., Afzal, A., ... Plessl, C. (2018). OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes. In Proceedings of the 26th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM). Boulder, CO, USA: ACM.

MLA:
Kenter, Tobias, et al. "OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes." Proceedings of the 26th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), Boulder, CO, USA, ACM, 2018.

Last updated on 2018-08-16 at 18:29