From Loop Fusion to Kernel Fusion: A Domain-specific Approach to Locality Optimization

Conference contribution
(Original article)


Publication Details

Author(s): Qiao B, Reiche O, Hannig F, Teich J
Publication year: 2019
Conference Proceedings Title: Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization
Page range: 242-253
ISBN: 978-1-7281-1436-1
Language: English


Abstract

Optimizing data-intensive applications such as image processing for GPU targets with complex memory hierarchies requires exploring the trade-offs among locality, parallelism, and computation. Loop fusion, one of the classical optimization techniques, has proven effective at improving locality at the function level. Image processing algorithms are growing in complexity and generally consist of many kernels in a pipeline. The inter-kernel communication is intensive and presents another opportunity for locality improvement at the system level. This paper addresses an optimization technique called kernel fusion for improving data locality. We present a formal description of the problem by defining an objective function for locality optimization. By transforming the fusion problem into a graph partitioning problem, we propose a solution based on the minimum cut technique that searches for fusible kernels recursively. In addition, we develop an analytic model to quantitatively estimate the potential locality improvement by incorporating domain-specific knowledge and architecture details. The proposed technique is implemented in Hipacc, an image processing DSL and source-to-source compiler, and evaluated on six image processing applications on three Nvidia GPUs. Our experiments show a geometric mean speedup of up to 2.52.
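To make the idea of kernel fusion concrete, the following is a minimal sketch in plain Python, not Hipacc code and not the authors' algorithm. It assumes two hypothetical point-wise kernels (`brighten`, `scale`) and a hypothetical access counter that stands in for global-memory traffic. It only illustrates why fusing a producer and a consumer kernel removes the round trip through an intermediate image; the graph partitioning and the analytic model described in the abstract are not modeled here.

```python
# Illustrative only: two point-wise "kernels" over a flat image,
# run unfused and fused. Kernel names and the access counter are
# hypothetical and are not taken from Hipacc or the paper.

def brighten(p):
    """Hypothetical producer kernel body."""
    return p + 10

def scale(p):
    """Hypothetical consumer kernel body."""
    return p * 2

def run_unfused(image):
    """Run both kernels separately through an intermediate buffer."""
    accesses = 0
    tmp = []
    for p in image:                 # kernel 1: read input, write intermediate
        tmp.append(brighten(p))
        accesses += 2
    out = []
    for p in tmp:                   # kernel 2: read intermediate, write output
        out.append(scale(p))
        accesses += 2
    return out, accesses

def run_fused(image):
    """Run one fused kernel; the intermediate value never leaves the loop body."""
    accesses = 0
    out = []
    for p in image:                 # fused kernel: read input, write output
        out.append(scale(brighten(p)))
        accesses += 2
    return out, accesses

image = [0, 1, 2, 3]
unfused_out, unfused_accesses = run_unfused(image)
fused_out, fused_accesses = run_fused(image)
```

In this toy model both variants produce the same output, while the fused variant performs half the counted buffer accesses. Real GPU kernels with stencil access patterns trade this saving against redundant computation, which is the kind of trade-off the abstract refers to.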


FAU Authors / FAU Editors

Hannig, Frank PD Dr.-Ing.
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Qiao, Bo
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Reiche, Oliver
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Teich, Jürgen Prof. Dr.-Ing.
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)


How to cite

APA:
Qiao, B., Reiche, O., Hannig, F., & Teich, J. (2019). From Loop Fusion to Kernel Fusion: A Domain-specific Approach to Locality Optimization. In Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization (pp. 242-253). Washington, DC, USA.

MLA:
Qiao, Bo, et al. "From Loop Fusion to Kernel Fusion: A Domain-specific Approach to Locality Optimization." Proceedings of the 2019 International Symposium on Code Generation and Optimization (CGO19), Washington DC, USA 2019. 242-253.

Last updated on 2019-10-03 at 12:38