Conference contribution


On-demand fault-tolerant loop processing on massively parallel processor arrays


Publication Details
Author(s): Tanase AP, Witterauf M, Teich J, Hannig F, Lari V
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication year: 2015
Conference Proceedings Title: Proceedings of the 26th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP)
Pages range: 194-201
ISBN: 9781479919246

Event details
Event: 26th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2015
Event location: Toronto
Start date of the event: 27/07/2015
End date of the event: 29/07/2015

Abstract

We present a compilation-based technique for providing on-demand structural redundancy for massively parallel processor arrays. Thereby, application programmers gain the capability to trade throughput for reliability according to application requirements. To protect parallel loop computations against errors, we propose to apply the well-known fault tolerance schemes dual modular redundancy (DMR) and triple modular redundancy (TMR) to a whole region of the processor array rather than individual processing elements. At the source code level, the compiler realizes these replication schemes with a program transformation that: (1) replicates a parallel loop program two or three times for DMR or TMR, respectively, and (2) introduces appropriate voting operations whose frequency and location may be chosen from three proposed variants. Which variant to choose depends, for example, on the error resilience needs of the application or the expected soft error rates. Finally, we explore the different tradeoffs of these variants in terms of performance overheads and error detection latency.
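The replicate-and-vote idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' compiler transformation and does not reproduce the three voting variants of the paper: it runs a simple accumulation loop in three software replicas and applies a majority vote at a configurable interval. The names `vote_tmr`, `run_tmr_loop`, `vote_every`, and the fault-injection callback `two_upsets` are hypothetical and introduced only for illustration.

```python
from typing import Callable, Optional, Sequence

# Hypothetical fault model (an assumption for this sketch):
# (replica, iteration, value) -> possibly corrupted value.
Fault = Callable[[int, int, int], int]


def vote_tmr(a: int, b: int, c: int) -> int:
    """Return the majority value of three replica results."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    raise RuntimeError("TMR vote failed: all three replicas disagree")


def run_tmr_loop(xs: Sequence[int], vote_every: int = 1,
                 fault: Optional[Fault] = None) -> int:
    """Accumulate xs in three replicas, voting every `vote_every` iterations."""
    acc = [0, 0, 0]  # one accumulator per replica
    last = len(xs) - 1
    for i, x in enumerate(xs):
        for r in range(3):
            acc[r] += x
            if fault is not None:
                acc[r] = fault(r, i, acc[r])  # inject an upset, if any
        # Vote at the chosen interval and always after the final iteration.
        if (i + 1) % vote_every == 0 or i == last:
            voted = vote_tmr(*acc)
            acc = [voted, voted, voted]  # overwrite a deviating replica
    return vote_tmr(*acc)


def two_upsets(replica: int, iteration: int, value: int) -> int:
    """Flip one bit in replica 0 at iteration 0 and in replica 1 at iteration 2."""
    masks = {(0, 0): 16, (1, 2): 32}
    return value ^ masks.get((replica, iteration), 0)


if __name__ == "__main__":
    # Voting after every iteration masks both upsets.
    print(run_tmr_loop([1, 2, 3, 4], vote_every=1, fault=two_upsets))
```

In this toy setting, voting after every iteration masks both injected upsets and yields the fault-free sum of 10. With a single vote at the end (`vote_every=4`), the two upsets accumulate in different replicas, all three results disagree, and the vote raises an error. This mirrors, in a very simplified way, the tradeoff the abstract mentions between voting overhead and error detection latency.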



How to cite
APA: Tanase, A.-P., Witterauf, M., Teich, J., Hannig, F., & Lari, V. (2015). On-demand fault-tolerant loop processing on massively parallel processor arrays. In Proceedings of the 26th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP) (pp. 194-201). Institute of Electrical and Electronics Engineers Inc.

MLA: Tanase, Alexandru-Petru, et al. "On-demand fault-tolerant loop processing on massively parallel processor arrays." Proceedings of the 26th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2015, Toronto. Institute of Electrical and Electronics Engineers Inc., 2015. 194-201.

Last updated on 2017-03-26 at 03:44