old-raft-rs icon indicating copy to clipboard operation
old-raft-rs copied to clipboard

Linearizability and Duplicate Requests

Open dzrw opened this issue 8 years ago • 6 comments

In chapter 6 on client interaction of the thesis, there is a paragraph regarding the requirements for achieving linearizability in Raft.

To achieve linearizability in Raft, servers must filter out duplicate requests. The basic idea is that servers save the results of client operations and use them to skip executing the same request multiple times. To implement this, each client is given a unique identifier, and clients assign unique serial numbers to every command. Each server’s state machine maintains a session for each client. The session tracks the latest serial number processed for the client, along with the associated response. If a server receives a command whose serial number has already been executed, it responds immediately without re-executing the request.

Given this filtering of duplicate requests, Raft provides linearizability.

Looking through the raft-rs implementation, I didn't find support for client request serial numbers or session management as described above. Is this intentional, or an oversight?

dzrw avatar Aug 26 '15 04:08 dzrw

A quick clarification: although the thesis indicates that this is the StateMachine's responsibility, it seems like raft-rs ought to provide a default session store. It seems like a pretty important requirement to get correct when using this library.

dzrw avatar Aug 26 '15 04:08 dzrw

Looking through the raft-rs implementation, I didn't find support for client request serial numbers or session management as described above. Is this intentional, or an oversight?

We currently have unique client IDs, but you are correct that we do not implement the linearizability feature described in the thesis. Additionally, it should be noted that (at this point) we allow stale reads, since the master immediately satisfies read requests without waiting for a commit (or commit equivalent).

danburkert avatar Aug 26 '15 05:08 danburkert

These serial numbers can be implemented by the client, and do not need to be part of actual Raft (simply encode your ID into your payload), so there's no technical limitation here. I agree that there's value in the library offering some functionality in that regard, but it's a non-trivial task to make that both useful and not wasting resources. Fundamentally, the client must specify the command ID: if Raft assigns it (think: a hash), how is it going to know whether a client re-sent the same proposal twice by accident or purpose? Raft may also not know who the actual client is (after all, you could have one "process" do the talking for thousands of clients and what a "client" is varies by application), so to avoid the state machine seeing those duplicate requests, you need to add a client ID as well. It's not too hard to go at this at a later stage.

tbg avatar Aug 26 '15 05:08 tbg

I filed #107 to track the issue I was referring to, since it's separate of this one (and should be higher priority).

danburkert avatar Aug 26 '15 05:08 danburkert

It's a non-trivial task to make that both useful and not wasting resources.

Agreed. It seems difficult to achieve, but thought it would be worthwhile to point out. I see that #107 is open; feel free to close this issue.

dzrw avatar Aug 29 '15 06:08 dzrw

It seems difficult to achieve, but thought it would be worthwhile to point out.

Absolutely, thanks for filing the issue! Preventing stale-reads is a necessary, but not sufficient to guarantee linearizability. As you point out, we also have to handle duplicate requests intelligently. I think we should leave this open, and revisit once we have a solution for #107. At that point we will probably have a better handle on what kind of overheads preventing duplicate requests would require, and whether it should be handled within raft-rs or within the application.

danburkert avatar Aug 29 '15 17:08 danburkert