maelstrom icon indicating copy to clipboard operation
maelstrom copied to clipboard

seq-kv reads never return final state for most recent write/cas

Open CameronAavik opened this issue 2 years ago • 5 comments

I'm doing the gossip glomers g-counter challenge at the moment in which you implement g-counter using seq-kv. Below is a spoiler of my solution, so just a warning for anyone who doesn't wish to be spoiled.

The solution I went for is that each node maintains its own sum under its own key in seq-kv, that means that if we have 3 nodes, then we will have 3 keys in seq-kv each storing the sum of its own node. On a background thread, each node repeatedly issues read requests for all 3 keys in seq-kv, and so my thinking was that after enough reads they should eventually get the latest value. This approach is also mentioned as one that should work in the other issue here.

What I am observing is that the most recent write or cas operation that occurs against any key in the seq-kv will never be read by other nodes in the system. I have attached at the bottom of the issue a log of the maelstrom output with --log-net-send turned on to show proof of this happening. If there is a better format to debug this data that you want me to upload, let me know. In this log I ran the g-counter workload against 2 nodes with a rate of 20 and a time limit of 1. The most recent cas/write message that occurs is: {:id 90, :src "n0", :dest "seq-kv", :body {:type "cas", :msg_id 28, :key "n0", :from 10, :to 14}}. After this, n1 sends a read for key "n0" 19 times, all of which return a value of 10 despite the fact that the cas sent from n0 changed it to 14.

I know according to the definition of sequential consistency that this is an allowed outcome, but I would have expected that eventually it would return the latest value. Sending noop cas's (e.g. from current value to current value) from all the nodes doesn't seem to fix this either. The solution I ended up getting to work was to set up another background thread on each node that just writes a random number to an unrelated key in seq-kv which seems to cause the reads to eventually return the latest values for each node.

maelstrom-seq-kv-no-progress.txt

CameronAavik avatar Feb 27 '23 22:02 CameronAavik

This may actually be a bug (er, sort of, but it's certainly not consistent with lww-kv) in seq-kv! I was sprinting to write these challenges and didn't have time to fully go through and test this one. Give me a little bit and I'll see if I can get that fixed for you.

aphyr avatar Feb 27 '23 22:02 aphyr

HA! There was a bug of sorts. seq-kv was doing something legal but v frustrating where reads and other re-orderable operations would converge on the next to latest state, but would never consider the latest state unless forced to. Version 0.2.3 should converge if you just spam it with requests. This affects #39 too.

aphyr avatar Feb 27 '23 23:02 aphyr

@aphyr, is it correct to say that if we want all the nodes to converge to the same value, using a sequential store gets there probabilistically, and a linear store deterministically? I'm wondering a bit what the pedagogy in challenge 4 is - certainly I've realised that sequential consistency is much looser than you'd think at first!

wolverian avatar Feb 28 '23 14:02 wolverian

Not exactly: there's both a probabilistic and a deterministic way to use seq-kv here. My thinking in designing this challenge was that folks would have this exact kind of "aha!" moment realizing that sequentially consistent stores didn't guarantee recency, and that they'd have an easy-but-inelegant way (doing lots of retries, as shown here) to work around that staleness. Because it's probabilistic and feels a bit inelegant, I was hoping it might prompt people to ask "wait, is there a more elegant, deterministic way I could do this?"--and that might lead them to discover the trick, and in the process gain (or show off) a deeper intuition about sequential consistency.

aphyr avatar Feb 28 '23 14:02 aphyr

@aphyr Is the "trick" in question the one that you mention in this comment https://github.com/jepsen-io/maelstrom/issues/39#issuecomment-1445414521?

Because later in the same issue thread, someone asked if this "trick" was a property of sequential consistency (https://github.com/jepsen-io/maelstrom/issues/39#issuecomment-1476731878), and your response seemed to indicate that this is not a property of sequential consistency, but a property of your particular implementation of data-store.

If this is the case, I definitely think you should clarify the challenge description to make it clear that the data-store has some additional properties beyond sequential consistency.

LouisGariepy avatar Apr 06 '23 01:04 LouisGariepy