Deterministic Fuzzy Checkpoints

Eischer M, Büttner M, Distler T (2019)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2019

Page Range: 153-162

Conference Proceedings Title: Proceedings of the 38th International Symposium on Reliable Distributed Systems (SRDS '19)

Event location: Lyon

URI: https://www4.cs.fau.de/Publications/2019/eischer_19_srds.pdf

DOI: 10.1109/SRDS47363.2019.00026

Abstract

Replicated systems tolerating arbitrary (Byzantine) faults require periodic and deterministic application-state checkpoints to perform essential tasks such as initializing new replicas, enabling faulty replicas to recover, and garbage-collecting old agreement-protocol messages. Existing techniques to create checkpoints in these systems make it necessary to temporarily suspend request execution in order to capture a consistent checkpoint, causing significant service disruptions for applications with large states. Unfortunately, state-of-the-art approaches from the domain of crash-tolerant systems also are not directly applicable, because the checkpoints they produce are not comparable across replicas and therefore cannot be validated in an environment in which replicas may fail arbitrarily and do not trust each other. In this paper, we address these problems by proposing deterministic fuzzy checkpoints (DFC), a novel technique that enables all correct replicas in a system to create consistent and matching checkpoints in parallel to processing requests. As a consequence, DFC increases service availability while still allowing replicas to verify the correctness of a checkpoint before applying it to their local states. In addition to our general approach, we present different alternatives to implement DFC within a replication library and furthermore discuss support for the creation of differential checkpoints. Experiments with a key-value store show that DFC is able to snapshot states of 3 GB while sustaining high performance throughout the entire checkpointing process.

How to cite

APA:

Eischer, M., Büttner, M., & Distler, T. (2019). Deterministic Fuzzy Checkpoints. In Proceedings of the 38th International Symposium on Reliable Distributed Systems (SRDS '19) (pp. 153-162). Lyon.

MLA:

Eischer, Michael, Markus Büttner, and Tobias Distler. "Deterministic Fuzzy Checkpoints." Proceedings of the 38th International Symposium on Reliable Distributed Systems (SRDS '19), Lyon 2019. 153-162.
