Computer Science
Resilient distributed causal memory in client-server model
Abstract
We study distributed causal shared memory (or key-value pairs) in an asynchronous network under crash failures. Causal memory, introduced by Ahamad et al. in the context of multi-processor environment in 1994, is an abstraction which ensures that nodes agree on the relative ordering of read and write operations that are causally related on key-value pairs. Inspired by the recent interests in geo-replicated causal storage systems (e.g., COPS, Eiger, Bolt-on), we systematically study the fault-tolerance property of the causal shared memory in the client-server model in this work. We identify that 2f + 1 servers is both necessary and sufficient to build a resilient causal memory in the presence of up to f crashed servers. We provide both the necessity proof and a new optimal algorithm that matches the bound. For evaluation, we implement our algorithm in Golang and compare the performance with state-of-the-art fault-tolerant algorithms that ensure strong consistency in the Google Cloud Platform.