
Introduce double write file system

Open Connor1996 opened this issue 1 year ago • 3 comments

This PR introduces a hedged file system that double-writes every I/O operation. The HedgedFileSystem manages two directories on different cloud disks. Every operation on the interface is serialized through one channel per disk, and the caller waits until either channel has consumed it. As a result, if one disk's I/O is slow for a long time, the other can still serve operations without delay. Once the slow disk returns to normal, it catches up by working through the operations accumulated in its channel, after which the states of the two disks are in sync again.
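The following is a minimal sketch of the hedging idea described above, not the code in this PR. It assumes in-memory logs in place of real disks, and `HedgedWriter`, `Disk`, `Task`, and `spawn_disk` are hypothetical names chosen for illustration. Each "disk" has its own worker thread fed by one channel, so operations are applied in order; a write returns as soon as either worker reports completion, and the slower worker keeps draining its queue afterwards.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// One queued operation plus a channel used to report completion.
struct Task {
    data: Vec<u8>,
    done: Sender<usize>, // carries the index of the disk that applied the task
}

/// Hypothetical stand-in for a disk: an in-memory log with artificial latency.
struct Disk {
    log: Arc<Mutex<Vec<Vec<u8>>>>,
    tx: Sender<Task>,
}

fn spawn_disk(id: usize, latency: Duration) -> Disk {
    let log = Arc::new(Mutex::new(Vec::new()));
    let (tx, rx) = channel::<Task>();
    let worker_log = Arc::clone(&log);
    thread::spawn(move || {
        // Tasks are applied strictly in the order they were queued.
        for task in rx {
            thread::sleep(latency); // simulated I/O cost
            worker_log.lock().unwrap().push(task.data);
            // The caller may already have returned; ignore a closed receiver.
            let _ = task.done.send(id);
        }
    });
    Disk { log, tx }
}

struct HedgedWriter {
    disks: [Disk; 2],
}

impl HedgedWriter {
    fn new(latency0: Duration, latency1: Duration) -> Self {
        HedgedWriter {
            disks: [spawn_disk(0, latency0), spawn_disk(1, latency1)],
        }
    }

    /// Queue the write on both disks and return the index of the disk
    /// that finished first. The other disk applies it later.
    fn write(&self, data: &[u8]) -> usize {
        let (done_tx, done_rx) = channel();
        for disk in &self.disks {
            disk.tx
                .send(Task { data: data.to_vec(), done: done_tx.clone() })
                .expect("disk worker exited");
        }
        drop(done_tx);
        done_rx.recv().expect("both disk workers exited")
    }

    /// Snapshot of what a given disk has applied so far.
    fn log(&self, disk: usize) -> Vec<Vec<u8>> {
        self.disks[disk].log.lock().unwrap().clone()
    }
}
```

In this sketch the caller's latency tracks the faster disk, while the slower disk's backlog lives in its channel until it catches up. It deliberately leaves out reads, error handling, and crash recovery.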

Close https://github.com/tikv/raft-engine/issues/342

Connor1996 avatar Jul 07 '23 02:07 Connor1996

My overall impression of this PR is that it relies on certain undocumented invariants in the upper layers, e.g. a sole append-only writer plus one reader instance per thread, to ensure concurrency safety in the lower-level HedgedFileSystem. I can think of two drawbacks:

  • There are many cases to handle in the FileSystem layer, for example read-write conflicts and non-atomic operations. Covering all of them is complicated, and it makes the code hard to understand.
  • In the future, an engineer who is unaware of these hidden relationships could easily cause a serious failure.

Please correct me if my understanding is wrong.

coderplay avatar Aug 25 '23 16:08 coderplay

Consider a scenario where the fast disk has written transactions 1, 2, and 3, while the slow disk has recorded only transaction 1. If the fast disk is somehow lost and only transaction 1 remains on the slow disk, data consistency problems may arise. Right?

coderplay avatar Sep 08 '23 19:09 coderplay

> Consider a scenario where the fast disk has written transactions 1, 2, and 3, while the slow disk has recorded only transaction 1. If the fast disk is somehow lost and only transaction 1 remains on the slow disk, data consistency problems may arise. Right?

It doesn't improve data durability. When either of the disks is lost or corrupted, raft engine should refuse to start, so it never has the chance to expose data inconsistency.
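As a rough illustration of that startup rule, the sketch below refuses to proceed unless both directories are present. `check_both_dirs` is a hypothetical helper, not raft-engine's API, and it only covers the "lost" case; detecting corruption would need a real consistency check that is not shown here.

```rust
use std::io;
use std::path::Path;

/// Hypothetical startup guard: fail unless both hedged directories exist.
fn check_both_dirs(primary: &Path, secondary: &Path) -> io::Result<()> {
    for dir in [primary, secondary] {
        if !dir.is_dir() {
            return Err(io::Error::new(
                io::ErrorKind::NotFound,
                format!("directory {} is missing; refusing to start", dir.display()),
            ));
        }
    }
    Ok(())
}
```

Failing fast here means the engine never serves reads from a disk that may be missing later operations.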

Connor1996 avatar Sep 11 '23 03:09 Connor1996