js
js copied to clipboard
Support delete
With current algorithm it's impossible to Delete a key without leaving a Tombstone forever in each Redis replica.
From the client point of view, it's only safe to assume a key is deleted if all replica respond that the key is missing.
The only solution I can think of at the moment is to implement some kind of copying garbage collector that periodically move all non-deleted key to a new epoch.
It'd be very expensive to move all non-deleted keys. It looks like there is a cheaper way:
- Imagine that along with
accept.lua
there isacceptTombstone.lua
which works the same way but also puts a key into a localdeleting
queue. - Once a client wants to remove a value it puts a tombstone but instead of
accept.lua
it usesacceptTombstone.lua
since it works the same way consistency isn't affected - A GC process regularly:
- peeks a key from the
deleting
queue - executes identity transformation with the write quorum covering whole cluster
- pops the key from
deleting
queue and pushes intodeleted
queue on every acceptor
- peeks a key from the
- Another GC process regularly:
- peeks a key from the
deleted
queue - connects to each acceptor and wipes a corresponding tombstone
- pops the key from the
deleted
queue
- peeks a key from the
Once we know that a tombstone was written to all acceptors it's safe to remove it outside the paxos read/write loop since the state (all value are tombstones) becomes isomorphic to the initial state (all values are null).
During a membership change an operator stops both GC processes, does a migration, updates GC's config (addresses of acceptors and proposers) and restarts the processes.
in step #3 when you say write quorum do you mean all nodes or just majority ? If it's all nodes then it mean Tombstone can only be cleaned when all node are online.
This still seem like a good practical solution, thanks a lot for the feedback.