kvrocks icon indicating copy to clipboard operation
kvrocks copied to clipboard

Possible split brain on migration race condition

Open zhixinwen opened this issue 3 months ago • 7 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

Version

https://github.com/apache/kvrocks/blob/9c9ffb82de7fd70e02ae0244faf05677d27f53b0/src/cluster/slot_migrate.cc#L424-L445 In finishSuccessfulMigration, the code first set SUCCESS on dest node then the local cluster marks the slot range on itself as migrated.

However, if the dest node has some network issue and we have an error for setImportStatusOnDstNode, then it is possible the dest node actually successfully processed setImportStatusOnDstNode and put the slot in imported_slots_.However, the local cluster does not mark the slot as migrated.

In this case, the old node would be able to serve read. And the new node would be able to server read and write: https://github.com/apache/kvrocks/blob/9c9ffb82de7fd70e02ae0244faf05677d27f53b0/src/cluster/cluster.cc#L926-L930

And if clients are sending traffic to the two nodes at the same time (e.g. because some clients have old slot mapping), then it is possible the clients hitting old node may see stale value. In the worst situation, if controller/operator sets the node slots to old value, all writes to the dest node will be lost?

In https://github.com/apache/kvrocks/blob/9c9ffb82de7fd70e02ae0244faf05677d27f53b0/src/cluster/cluster.cc#L926-L930, should we only return OK if the connection is in ASKING mode?

Minimal reproduce step

N/A

What did you expect to see?

N/A

What did you see instead?

N/A

Anything Else?

No response

Are you willing to submit a PR?

  • [x] I'm willing to submit a PR!

zhixinwen avatar Aug 27 '25 23:08 zhixinwen

@zhixinwen Thanks for your report.

should we only return OK if the connection is in ASKING mode?

And yes, I think it makes sense to allow the destination node to serve the slot only if the source node redirects the request.

git-hulk avatar Aug 28 '25 14:08 git-hulk

@git-hulk Thanks for your response.

What would happen when master becomes slave after master calls setImportStatusOnDstNode(*dst_fd_, kImportSuccess)? To me, looks like the new master would have none as migration_state_, and the controller would keep the old topology. However, the dest node would still have imported_slots_ set and it could potentially receive some writes MOVED from old source node. In this case, some writes would still be lost?

zhixinwen avatar Aug 28 '25 18:08 zhixinwen

Let's take a step back, do we really need imported_slots_?

SetClusterNodes will clear imported_slots_ anyway. It is possible that a master-slave failover happened in the cluster for some other nodes and it triggered controller to call SetClusterNodes. migrationLoop has not update the migration status yet. In this case, imported_slots_ would be cleared even if it should not anyway.

zhixinwen avatar Aug 28 '25 20:08 zhixinwen

However, the dest node would still have imported_slots_ set and it could potentially receive some writes MOVED from old source node. In this case, some writes would still be lost?

For this case, it will fail to write instead of losing data. Because the source node will redirect the write request to the destination node, but the cluster information hasn't been updated yet.

Let's take a step back, do we really need imported_slots_?

I think yes. There is a time window between updating the cluster slots and finishing the migration status. The destination node cannot serve the imported slot if it doesn't have the imported status.

git-hulk avatar Aug 29 '25 02:08 git-hulk

Because the source node will redirect the write request to the destination node

Agree

but the cluster information hasn't been updated yet.

I wonder if there would be any case the cluster info will never be updated and the old topology is kept. Master that is being migrated becomes slave can be such a case.

If it is possible, then some writes may be redirected to a node that would not own the slot and we have write loss? I feel the issue is before controller decides a node owns a slot, the node does not really own it. Therefore any write of the slot to the new node can potentially cause issue.

zhixinwen avatar Aug 29 '25 05:08 zhixinwen

I wonder if there would be any case the cluster info will never be updated and the old topology is kept. Master that is being migrated becomes slave can be such a case.

I think yes in this case, the new writes will be lost if the source node promoted a new replica.

git-hulk avatar Aug 29 '25 05:08 git-hulk

Thanks for confirming. I will look deeper into it and see what we can do. I feel we would need to block both source and dst node until controller confirms the topology change. I will need to run some experiment to see how bad it would affect the migration process.

zhixinwen avatar Aug 29 '25 05:08 zhixinwen