Parallelization Approaches for Hardware Accelerators - Loop Unrolling versus Loop Partitioning

Hannig F, Dutta H, Teich J (2009)


Publication Type: Conference contribution

Publication year: 2009

Publisher: Springer-Verlag

Edited Volumes: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Series: Lecture Notes in Computer Science (LNCS)

Book Volume: 5455

Page Range: 16-27

Conference Proceedings Title: Proceedings of the 22nd International Conference on Architecture of Computing Systems

Event location: Delft, NL

ISBN: 978-3-642-00453-7

DOI: 10.1007/978-3-642-00454-4_5

Abstract

State-of-the-art behavioral synthesis tools offer few high-level transformations for achieving highly parallelized implementations. If they offer any, they apply loop unrolling to obtain higher throughput. In this paper, we employ the PARO behavioral synthesis tool, which has the unique ability to perform both loop unrolling and loop partitioning. Loop unrolling replicates the loop kernel and exposes the parallelism for hardware implementation, whereas partitioning tiles the loop program onto a regular array of tightly coupled processing elements. Using the same design tool for both variants enables, for the first time, a quantitative evaluation of the two approaches for reconfigurable architectures, based on computationally intensive algorithms selected from different benchmarks. Superlinear speedups in terms of throughput are achieved with the processor array approach. In addition, area and power costs are reduced. © 2009 Springer Berlin Heidelberg.
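
The difference between the two transformations named in the abstract can be illustrated with a small software analogy. The sketch below is not taken from the paper and does not reflect how PARO works internally; the FIR filter, the function names, the unroll factor of 2, and the `num_pes` parameter are all assumptions chosen for illustration. Unrolling replicates the kernel body inside one loop, while partitioning splits the iteration space into tiles, each of which would be mapped to its own processing element in hardware (here the tiles simply run one after another).

```python
# Illustrative sketch only: names and the FIR example are assumptions,
# not the paper's benchmarks or the PARO tool's output.

def fir_reference(x, a):
    """Plain loop nest: y[i] = sum over j of a[j] * x[i + j]."""
    n, taps = len(x) - len(a) + 1, len(a)
    y = [0] * n
    for i in range(n):
        for j in range(taps):
            y[i] += a[j] * x[i + j]
    return y


def fir_unrolled_by_2(x, a):
    """Loop unrolling: the kernel body is replicated twice per trip of the i-loop."""
    n, taps = len(x) - len(a) + 1, len(a)
    y = [0] * n
    i = 0
    while i + 1 < n:
        for j in range(taps):
            y[i] += a[j] * x[i + j]              # copy 0 of the kernel
            y[i + 1] += a[j] * x[i + 1 + j]      # copy 1 of the kernel
        i += 2
    if i < n:                                    # leftover iteration when n is odd
        for j in range(taps):
            y[i] += a[j] * x[i + j]
    return y


def fir_partitioned(x, a, num_pes=2):
    """Loop partitioning: the i-range is tiled, one tile per hypothetical processing element."""
    n, taps = len(x) - len(a) + 1, len(a)
    y = [0] * n
    tile = -(-n // num_pes)                      # ceiling division gives the tile size
    for pe in range(num_pes):                    # in hardware, the tiles would run concurrently
        for i in range(pe * tile, min((pe + 1) * tile, n)):
            for j in range(taps):
                y[i] += a[j] * x[i + j]
    return y
```

All three functions compute the same result; only the loop structure changes. The throughput, area, and power trade-offs evaluated in the paper arise in hardware synthesis and cannot be observed in this sequential sketch.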

How to cite

APA:

Hannig, F., Dutta, H., & Teich, J. (2009). Parallelization Approaches for Hardware Accelerators - Loop Unrolling versus Loop Partitioning. In Proceedings of the 22nd International Conference on Architecture of Computing Systems (pp. 16-27). Delft, NL: Springer-Verlag.

MLA:

Hannig, Frank, Hritam Dutta, and Jürgen Teich. "Parallelization Approaches for Hardware Accelerators - Loop Unrolling versus Loop Partitioning." Proceedings of the 22nd International Conference on Architecture of Computing Systems (ARCS), Delft: Springer-Verlag, 2009. 16-27.
