Experiences with software-based soft-error mitigation using AN codes

Hoffmann M, Ulbrich P, Dietrich C, Schirmeier H, Lohmann D, Schröder-Preikschat W (2016)


Publication Status: Published

Publication Type: Book chapter / Article in edited volumes

Publication year: 2016

Journal

Publisher: Springer Verlag (Germany)

Edited Volumes: Software Quality Journal

Book Volume: 24

Page Range: 87-113

Journal Issue: 1

DOI: 10.1007/s11219-014-9260-4

Abstract

Arithmetic error coding schemes are a well-known and effective technique for soft-error mitigation. Although the underlying coding theory is a complex area of mathematics, its practical implementation is comparatively simple. However, compliance with the theory is easily lost on the way to an actual implementation, which ultimately jeopardizes the intended fault-tolerance characteristics and effectiveness. In this paper, we present our experiences and lessons learned from implementing arithmetic error coding schemes (AN codes) in the context of our Combined Redundancy fault-tolerance approach. We focus on the challenges and pitfalls in the transition from maths to machine code for a binary computer from a systems perspective. Our results show that practical misconceptions (such as the use of prime numbers) and architecture-dependent implementation glitches occur at every stage of this transition. We identify typical pitfalls and describe practical measures to find and resolve them. This allowed us to eliminate all remaining silent data corruptions in the Combined Redundancy framework, which we validated by an extensive fault-injection campaign covering the entire fault space of 1-bit and 2-bit errors.
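For readers unfamiliar with the term, the basic idea of an AN code can be sketched as follows: a value is stored as its product with a constant A, and any code word that is not a multiple of A is treated as corrupted. This is a minimal illustration only, not the Combined Redundancy implementation described in the paper. The constant `A = 251`, the 32-bit word width, and all function names are assumptions made for the example.

```python
# Minimal AN-code sketch. A and WORD_BITS are hypothetical values chosen
# for illustration; they are not taken from the paper.
A = 251          # assumed encoding constant (any odd A > 1 works for this demo)
WORD_BITS = 32   # assumed width of a stored code word


def encode(value: int) -> int:
    """Encode a value as a multiple of A."""
    return A * value


def is_valid(code_word: int) -> bool:
    """A code word is valid only if it is divisible by A."""
    return code_word % A == 0


def decode(code_word: int) -> int:
    """Check the code word and recover the original value."""
    if not is_valid(code_word):
        raise ValueError("corrupted code word")
    return code_word // A


def flip_bit(code_word: int, bit: int) -> int:
    """Model a soft error as a single bit flip."""
    return code_word ^ (1 << bit)


def undetected_single_bit_flips(value: int) -> int:
    """Count single-bit flips of the encoded value that pass the check."""
    code_word = encode(value)
    return sum(is_valid(flip_bit(code_word, bit)) for bit in range(WORD_BITS))
```

Addition works directly on encoded operands, since A·a + A·b = A·(a + b), so `decode(encode(7) + encode(5))` yields 12. A single bit flip changes the code word by a power of two, which an odd A greater than 1 never divides, so every single-bit error fails the check. How well multi-bit errors are detected depends on the choice of A and on implementation details, which is the subject of the paper.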

How to cite

APA:

Hoffmann, M., Ulbrich, P., Dietrich, C., Schirmeier, H., Lohmann, D., & Schröder-Preikschat, W. (2016). Experiences with software-based soft-error mitigation using AN codes. Software Quality Journal, 24(1), 87-113. https://doi.org/10.1007/s11219-014-9260-4

MLA:

Hoffmann, Martin, et al. "Experiences with software-based soft-error mitigation using AN codes." Software Quality Journal, vol. 24, no. 1, 2016, pp. 87-113.
