rocksplicator
rocksplicator copied to clipboard
Correctly pass in replication mode to the replication handler
Correctly pass in replication mode for replication mode replication handler to deal with.
Details
- The Config was not passed into the replication handler
- So the threads were waiting at Numberbox::wait for the repl to complete
- But as the repl-handler did know about the repl-mode, it did not do a post
- So all the waits timedout thereby increasing the latency ..
Tests
- Checked with various rep modes[0,1,2]
- Made sure it is reflected correctly on writes and in the replication handler
- No latency spikes
- Also ran Repl Consistency checker to verify
@newpoo - Updated the description with details of the issue
I see the bug now. I think that worker threads will always be blocked when change 0->1 or 0->2, and 1->2 change doesn't really enable 2-ack mode.
Can you add some tests for all possible replication mode change pattern?
We also need to remove the config parameter from handleReplicateRequest().
Why did we add dynamic per DB replication mode change in the first place? If we do need it, I think we should add a dedicated interface to allow rocksplicator client (such as rockstore) to change the per db replication mode. Passing it through every write() call doesn't seem to be optimal.
The reason we wanted to pass config to the write calls was, We want at some point allow the clients to choose what kind of durability they want per write .. Same for read from leader . Moreover, replication call, we lookup the db from the db map. For now that seems to be clean.. Let me know if you have any other thoughts
Posting a discussion from slack so we don't lose the context:
With replication mode mutated on a per-request/write basis, it seems we can easily run into race conditions like:
- writeA with mode 2, then immediately followed by writeB with mode 0
- replicatorDB update its local mode to be 0
- replicatorDB receives a
handleReplicateRequest
from a follower, since the current mode is 0, it won’t post themax_seq_no_acked_
- writeA now times out
Alos, do we want to support dynamically setting replication mode (0/1/2) on a pre-request/write basis? What's the use case? I can imagine we do that on a per-resource/dataset level (based on dataset config) if needed, but probably shouldn't change that during runtime.