add diskless-fallback option for repl-diskless-load configuration
Description
When a Replica with the default value of repl-diskless-load ("disabled") performs a full sync with the master, it first syncs the RDB data to disk. Only after completing this step does the Replica load the data into memory.
Disconnections during disk-based sync result sometimes in infinite sync loops and outdated data on the replica.
The longer the primary and replica are disconnected and out of sync, the more stale the data that is on the replica becomes, and the less beneficial it becomes to do disk-based sync as time progresses.
A common example is short writes (insufficient disk space) on the replica - if a replica doesn't have enough disk space for the RDB it will fail repeatedly.
This solution will also address another issue - Replica Client Output Buffer overrun on the master side. In many cases, high traffic persists during subsequent sync attempts, leading to repeating COB overruns. Under the assumption that diskless syncs are faster, this will reduce the risk of exceeding the buffer size limit.
Code Changes
server.h and server.c
added the new fields -
-
sync_aborts_total- This field is incremented when we recognize that a disconnection happened during a sync attempt, for information purpose only. -
last_sync_abortedsame as previous, but it's reset after a successful sync. we use it when choosing a new sync strategy. -
replica_last_sync_typeindicates which strategy used in the last sync.
replication.c
- In the existing code, when a sync fails,
cancelReplicationHandshakeis called , which then callsreplicationAbortSyncTransfer.last_sync_abortedis set to true, andsync_aborts_totalis increased. - function
shouldFallbackToDisklessLoadadded in order to implement the fallback duringuseDisklessLoad- the logic for replica choice of sync type. - The Replica sends newline to primary node once when flushing it's data, and every time the RDB load progress. If the primary loses connection to the replica, the state will change is changed to CONN_STATE_ERROR.
added a check of connection state after the RDB load finish successfully, and before the load marked as finish by server.repl_state = REPL_STATE_CONNECTED and connection state changes. after that, sync is marked as successful and last_sync_aborted is reset.
Testing
Added a new test in tests/unit/cluster/diskless-sync-fallback.tcl which checks just the fallback logic; If a disk-based sync has failed, then the next sync attempt should be diskless (RDB received straight to parser).
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 70.21%. Comparing base (
71dd85d) to head (67dfbc0). Report is 29 commits behind head on unstable.
:exclamation: Current head 67dfbc0 differs from pull request most recent head adf4fa4
Please upload reports for the commit adf4fa4 to get more accurate results.
Additional details and impacted files
@@ Coverage Diff @@
## unstable #601 +/- ##
============================================
+ Coverage 70.09% 70.21% +0.11%
============================================
Files 110 110
Lines 60041 60085 +44
============================================
+ Hits 42085 42187 +102
+ Misses 17956 17898 -58
| Files | Coverage Δ | |
|---|---|---|
| src/config.c | 78.06% <ø> (-0.26%) |
:arrow_down: |
| src/replication.c | 87.44% <100.00%> (+0.34%) |
:arrow_up: |
| src/server.c | 88.58% <100.00%> (-0.35%) |
:arrow_down: |
| src/server.h | 100.00% <ø> (ø) |
I do not think the top comment explains the main issue we are trying to solve here. disk based load on the replica side is meant in order to help the replica keep on serving reads and avoid bigger data loss when sync is in progress. fallback to diskless is mostly helpful in cases were we have a disk related issue (eg. short writes) and not in order to bridge COB overrun issues which are better handled by different solutions (which we currently lack some, like throttling on COB growth). I would suggest to rewrite the top comment.
@ranshid Do we still want to push this?