On-demand fault-tolerant loop processing on massively parallel processor arrays

Conference contribution
(Conference paper)


Publication details

Author(s): Tanase AP, Witterauf M, Teich J, Hannig F, Lari V
Publisher: Institute of Electrical and Electronics Engineers Inc.
Year of publication: 2015
Conference proceedings: Proceedings of the 26th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP)
Pages: 194-201
ISBN: 9781479919246


Abstract


We present a compilation-based technique for providing on-demand structural redundancy for massively parallel processor arrays. Thereby, application programmers gain the capability to trade throughput for reliability according to application requirements. To protect parallel loop computations against errors, we propose to apply the well-known fault tolerance schemes dual modular redundancy (DMR) and triple modular redundancy (TMR) to a whole region of the processor array rather than individual processing elements. At the source code level, the compiler realizes these replication schemes with a program transformation that: (1) replicates a parallel loop program two or three times for DMR or TMR, respectively, and (2) introduces appropriate voting operations whose frequency and location may be chosen from three proposed variants. Which variant to choose depends, for example, on the error resilience needs of the application or the expected soft error rates. Finally, we explore the different tradeoffs of these variants in terms of performance overheads and error detection latency.
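The replicate-and-vote idea described in the abstract can be illustrated with a small sketch. This is not the authors' compiler transformation: the FIR loop, the function names (`fir_replica`, `vote_tmr`, `run_tmr`), and the fault-injection parameter are all hypothetical, and the replicas run sequentially in software rather than on separate regions of a processor array. Voting here happens after every iteration, which is only one possible placement; the three variants proposed in the paper are not detailed in the abstract and are not reproduced here.

```python
from typing import List, Optional, Sequence, Tuple


def fir_replica(x: Sequence[int], taps: Sequence[int],
                fault_at: Optional[int] = None) -> List[int]:
    """One replica of a simple loop nest (a FIR filter).

    fault_at is a hypothetical test hook: if set, the result of that
    outer iteration is corrupted to mimic a soft error.
    """
    y = []
    for i in range(len(x)):
        acc = 0
        for j, w in enumerate(taps):
            if i - j >= 0:
                acc += w * x[i - j]
        if fault_at == i:
            acc += 1  # injected error, for illustration only
        y.append(acc)
    return y


def vote_tmr(a: int, b: int, c: int) -> Tuple[int, bool]:
    """Majority vote over three replica results.

    Returns the majority value and whether any replica disagreed.
    """
    if a == b or a == c:
        return a, not (a == b == c)
    if b == c:
        return b, True
    raise RuntimeError("no majority: all three replicas disagree")


def run_tmr(x: Sequence[int], taps: Sequence[int],
            faults: Sequence[Optional[int]] = (None, None, None)
            ) -> Tuple[List[int], List[int]]:
    """Run three replicas of the loop and vote on every iteration."""
    replicas = [fir_replica(x, taps, f) for f in faults]
    out, flagged = [], []
    for i, (a, b, c) in enumerate(zip(*replicas)):
        value, disagreed = vote_tmr(a, b, c)
        out.append(value)
        if disagreed:
            flagged.append(i)
    return out, flagged


if __name__ == "__main__":
    # The second replica is corrupted at iteration 2; the vote masks it.
    result, flagged = run_tmr([1, 2, 3, 4], [1, 1], faults=(None, 2, None))
    print(result, flagged)
```

With only two replicas (DMR), the same comparison can detect a mismatch but cannot decide which replica is correct, which is one way to see the throughput-versus-reliability trade mentioned above.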


FAU authors / FAU editors

Hannig, Frank PD Dr.-Ing.
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Lari, Vahid
Sonderforschungsbereich/Transregio 89 Invasives Rechnen
Tanase, Alexandru-Petru Dr.-Ing.
Sonderforschungsbereich/Transregio 89 Invasives Rechnen
Teich, Jürgen Prof. Dr.-Ing.
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)
Witterauf, Michael
Lehrstuhl für Informatik 12 (Hardware-Software-Co-Design)


How to cite

APA:
Tanase, A.-P., Witterauf, M., Teich, J., Hannig, F., & Lari, V. (2015). On-demand fault-tolerant loop processing on massively parallel processor arrays. In Proceedings of the 26th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP) (pp. 194-201). Toronto, CA: Institute of Electrical and Electronics Engineers Inc.

MLA:
Tanase, Alexandru-Petru, et al. "On-demand fault-tolerant loop processing on massively parallel processor arrays." Proceedings of the 26th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2015, Toronto. Institute of Electrical and Electronics Engineers Inc., 2015. 194-201.


Last updated 2018-07-10 at 21:50