CrossCheck: A Holistic Approach for Tolerating Crash-Faults and Arbitrary Failures

Martens A, Borchert C, Nieke M, Spinczyk O, Kapitza R (2016)


Publication Type: Conference contribution

Publication year: 2016

Publisher: Institute of Electrical and Electronics Engineers Inc.

Pages Range: 65-76

Conference Proceedings Title: Proceedings - 2016 12th European Dependable Computing Conference, EDCC 2016

Event location: Gothenburg, SWE

ISBN: 9781509015825

DOI: 10.1109/EDCC.2016.29

Abstract

High availability is no longer optional since more and more Internet-based services provide economical or otherwise critical offerings. Traditionally, crash faults are addressed using state-machine replication (SMR) and critical data is selectively protected by checksums. Both techniques can be efficiently combined; however, large parts of a service remain susceptible to transient errors such as bit-flips or more severe state corruptions. To address this weakness and also to reduce the labouring and non-trivial effort of identifying and selectively hardening a complex service, we propose CrossCheck - A holistic approach. CrossCheck extends the crash-fault protection of SMR to also provide tolerance against arbitrary state corruptions, thereby especially addressing multithreaded applications. This is achieved by a fine-grained state comparison and a precise recovery mechanism using fault-free replicas. The implementation utilizes aspect-oriented programming and therefore requires only minimal manual changes to the underlying software. In our evaluation, we show that a multithreaded key-value store can be made resilient to crashes and hardened against arbitrary state corruptions with moderate overhead.

How to cite

APA:

Martens, A., Borchert, C., Nieke, M., Spinczyk, O., & Kapitza, R. (2016). CrossCheck: A Holistic Approach for Tolerating Crash-Faults and Arbitrary Failures. In Proceedings - 2016 12th European Dependable Computing Conference, EDCC 2016 (pp. 65-76). Gothenburg, SWE: Institute of Electrical and Electronics Engineers Inc.

MLA:

Martens, Arthur, et al. "CrossCheck: A Holistic Approach for Tolerating Crash-Faults and Arbitrary Failures." Proceedings of the 12th European Dependable Computing Conference, EDCC 2016, Gothenburg, SWE: Institute of Electrical and Electronics Engineers Inc., 2016. 65-76.
