From Loop Fusion to Kernel Fusion: A Domain-specific Approach to Locality Optimization

Conference contribution
(Original article)


Publication details

Authors: Qiao B, Reiche O, Hannig F, Teich J
Year of publication: 2019
Proceedings: Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization
Page range: 242-253
ISBN: 978-1-7281-1436-1
Language: English


Abstract

Optimizing data-intensive applications such as image processing for GPU targets with complex memory hierarchies requires exploring the tradeoffs among locality, parallelism, and computation. Loop fusion, one of the classical optimization techniques, has proven effective at improving locality at the function level. Image processing algorithms are growing in complexity and generally consist of many kernels in a pipeline. The inter-kernel communication is intensive and offers another opportunity for locality improvement at the system level. This paper addresses an optimization technique called kernel fusion for improving data locality. We present a formal description of the problem by defining an objective function for locality optimization. By transforming the fusion problem into a graph partitioning problem, we propose a solution based on the minimum cut technique to search for fusible kernels recursively. In addition, we develop an analytic model that quantitatively estimates the potential locality improvement by incorporating domain-specific knowledge and architecture details. The proposed technique is implemented in an image processing DSL and source-to-source compiler called Hipacc and evaluated on six image processing applications on three Nvidia GPUs. A geometric mean speedup of up to 2.52 can be observed in our experiments.
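
The recursive minimum-cut search mentioned in the abstract can be illustrated with a small sketch. This is not the Hipacc implementation: the abstract names neither the min-cut algorithm nor the legality criteria, so the code below assumes the Stoer-Wagner algorithm, a hypothetical `is_legal` predicate, and made-up edge weights standing in for the estimated locality benefit of fusing two kernels. A block that cannot be fused as a whole is split along its cheapest cut, and each side is examined again.

```python
def min_cut(nodes, w):
    """Stoer-Wagner global minimum cut (assumed choice of algorithm).
    w maps frozenset({u, v}) -> weight. Returns (cut_weight, one_side)."""
    groups = {n: {n} for n in nodes}  # original nodes merged into each super-node
    adj = {u: {v: 0 for v in nodes if v != u} for u in nodes}
    for edge, weight in w.items():
        u, v = tuple(edge)
        adj[u][v] += weight
        adj[v][u] += weight
    best = (float("inf"), None)
    active = list(nodes)
    while len(active) > 1:
        # One phase: repeatedly add the most tightly connected node.
        start = active[0]
        added = [start]
        conn = {v: adj[start][v] for v in active if v != start}
        while conn:
            nxt = max(conn, key=conn.get)
            cut_of_phase = conn.pop(nxt)
            added.append(nxt)
            for v in conn:
                conn[v] += adj[nxt][v]
        s, t = added[-2], added[-1]
        if cut_of_phase < best[0]:
            best = (cut_of_phase, set(groups[t]))
        # Merge the last node t into s before the next phase.
        groups[s] |= groups[t]
        for v in active:
            if v not in (s, t):
                adj[s][v] += adj[t][v]
                adj[v][s] = adj[s][v]
        active.remove(t)
    return best


def fuse(nodes, w, is_legal):
    """Recursively split a block of kernels until every block is fusible.
    is_legal is a hypothetical predicate; real criteria are not modeled here."""
    nodes = list(nodes)
    if len(nodes) == 1 or is_legal(nodes):
        return [set(nodes)]
    inside = set(nodes)
    sub_w = {e: x for e, x in w.items() if e <= inside}
    _, side = min_cut(nodes, sub_w)
    left = [n for n in nodes if n in side]
    right = [n for n in nodes if n not in side]
    return fuse(left, w, is_legal) + fuse(right, w, is_legal)


# Hypothetical four-kernel pipeline a -> b -> c -> d; weights are invented
# locality benefits, and "at most two kernels per block" is a toy legality rule.
weights = {
    frozenset({"a", "b"}): 8,
    frozenset({"b", "c"}): 1,
    frozenset({"c", "d"}): 6,
}
blocks = fuse(["a", "b", "c", "d"], weights, lambda block: len(block) <= 2)
print([sorted(b) for b in blocks])  # [['c', 'd'], ['a', 'b']]
```

In this toy pipeline the cheapest cut removes the weak edge between b and c, so the two high-benefit pairs stay together as fused blocks.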


FAU authors / FAU editors

Hannig, Frank PD Dr.-Ing.
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Qiao, Bo
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Reiche, Oliver
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Teich, Jürgen Prof. Dr.-Ing.
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)


How to cite

APA:
Qiao, B., Reiche, O., Hannig, F., & Teich, J. (2019). From Loop Fusion to Kernel Fusion: A Domain-specific Approach to Locality Optimization. In Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization (pp. 242-253). Washington DC, USA.

MLA:
Qiao, Bo, et al. "From Loop Fusion to Kernel Fusion: A Domain-specific Approach to Locality Optimization." Proceedings of the 2019 International Symposium on Code Generation and Optimization (CGO19), Washington DC, USA 2019. 242-253.

Last updated 2019-10-03 at 12:38