Add a router tag for all mutations after enabling HA [release-7.3]

Open jzhou77 opened this issue 2 years ago • 5 comments

We found a data corruption bug when switching from a single region to two regions, i.e., re-enabling HA. The exact sequence leading to the corruption in the test is:

  1. Epoch 6: 2 regions.
  2. Epoch 8: usable_regions is changed to 1; this configuration change transaction commits at version V.
  3. Epoch 10: usable_regions is now 2 and recoverAt is V. The data at V is copied to the newly recruited tlogs. The remote SS peeked the tlog but has not persisted V's data yet.
  4. Restart. Epoch 12: another recovery.
  5. Epoch 14: the remote tlog starts at Unrecovered < V (in fact, Unrecovered == Epoch 8's endVersion). pullAsyncData is therefore pulling with the log router tag, but the mutations at V (in Epoch 10's tlogs) don't have router tags.

To reproduce (commit d24a62cc7, clang build):

  -f ./tests/restarting/from_7.3.0/ConfigureTestRestart-1.toml -b on -s 1855375089
  -f ./tests/restarting/from_7.3.0/ConfigureTestRestart-2.toml -b on -s 1855375090 --restarting

So the problem is that the tlog data in [PreviousEpochEndVersion, RecoveryVersion] copied from epoch 8 to epoch 10 can be lost, because it doesn't have a log router tag. At epoch 14, when pullAsyncData() of the new remote tlog tries to copy the data starting from version V, those mutations are not copied.
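The failure mode can be illustrated with a toy model of a tag-filtered peek. This is only a sketch: `Tag`, `Entry`, `peek`, `SS_TAG`, and the version numbers are hypothetical names and values chosen for illustration, not FoundationDB code. The only detail taken from this PR is the router tag value (-2, 0).

```python
from typing import NamedTuple

class Tag(NamedTuple):
    locality: int
    id: int

class Entry(NamedTuple):
    version: int
    tags: frozenset
    mutation: str

ROUTER_TAG = Tag(-2, 0)  # static router tag value named in this PR
SS_TAG = Tag(0, 1)       # hypothetical storage-server tag

def peek(log, tag, begin):
    """Toy peek: versions >= begin whose entries carry `tag`."""
    return [e.version for e in log if e.version >= begin and tag in e.tags]

V = 100            # hypothetical version of the configuration change
UNRECOVERED = 90   # hypothetical start version of the new remote tlog (< V)

# The mutation at V was written while usable_regions=1, so it has no router tag.
log = [Entry(V, frozenset({SS_TAG}), "set k=v")]

by_ss_tag = peek(log, SS_TAG, UNRECOVERED)          # the data is in the tlog
by_router_tag = peek(log, ROUTER_TAG, UNRECOVERED)  # but invisible to this peek
```

In this model `by_ss_tag` contains V while `by_router_tag` is empty, which mirrors how a pull keyed on the router tag skips the copied range.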

The solution is to add a static router tag (-2, 0) to all mutations after HA is enabled. Since this configuration transaction triggers a recovery, the next epoch has log router tags, so the copied range carries the proper tag and can be pulled by the remote side, i.e., by log routers. For tlog data that was not copied from the old epoch while usable_regions was 1, remote storage servers peek directly from the old tlogs (because of usable_regions=1).
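A minimal sketch of the tagging rule, under stated assumptions: `tags_for_mutation`, `peek_versions`, and the `ha_enabled` flag are hypothetical helpers for illustration, and the real change lives in the commit path rather than in a function like this. Only the tag value (-2, 0) comes from the description above.

```python
ROUTER_TAG = (-2, 0)  # static router tag from this PR: (locality, id)
SS_TAG = (0, 1)       # hypothetical storage-server tag

def tags_for_mutation(base_tags, ha_enabled):
    """Return the tag set for a mutation, adding the router tag once HA is on."""
    tags = set(base_tags)
    if ha_enabled:
        tags.add(ROUTER_TAG)
    return frozenset(tags)

def peek_versions(log, tag, begin):
    """Toy peek: versions >= begin whose tag set contains `tag`."""
    return [version for version, tags in log if version >= begin and tag in tags]

# The same mutation at a hypothetical version 100, tagged before and after the fix.
log_without_fix = [(100, tags_for_mutation({SS_TAG}, ha_enabled=False))]
log_with_fix = [(100, tags_for_mutation({SS_TAG}, ha_enabled=True))]

pulled_without_fix = peek_versions(log_without_fix, ROUTER_TAG, 90)
pulled_with_fix = peek_versions(log_with_fix, ROUTER_TAG, 90)
```

With the router tag attached, the version becomes visible to a router-tag peek in this model; without it the peek returns nothing.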

The problem with this solution is that mutations in [PreviousEpochEndVersion, ConfigChangeVersion) still do not have router tags and are thus still missing on the remote SSes. A potential solution: when the backup_worker_enabled:=1 configuration is used, every mutation will have a router tag even if usable_regions=1. A more heavyweight alternative is for DD to quickly drop remote SSes when switching from 2 regions to 1, and to let DD re-replicate the data on the second switch from 1 region to 2 regions.
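The remaining gap is a half-open version interval, which the following sketch makes concrete. The function name and all version numbers are hypothetical; the interval bounds are the ones named in the paragraph above.

```python
def untagged_versions(prev_epoch_end, config_change_version, versions):
    """Versions in [prev_epoch_end, config_change_version): still no router tag."""
    return [v for v in versions if prev_epoch_end <= v < config_change_version]

# Hypothetical commit versions in the old epoch; the config change commits at 70.
versions = [50, 60, 70, 80]
still_missing = untagged_versions(50, 70, versions)
covered = [v for v in versions if v not in still_missing]
```

Here `still_missing` is `[50, 60]`: the versions before the configuration change remain untagged, while 70 and later are covered by the static router tag.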

100k: 20231116-173613-jzhou-c056a50152d59d2f
100k from_7.3.0/ConfigureTestRestart*: 20231116-180821-jzhou-e0fce6db35f04567

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • [ ] The PR has a description, explaining both the problem and the solution.
  • [ ] The description mentions which forms of testing were done and the testing seems reasonable.
  • [ ] Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • [ ] This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • [ ] There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

Nov 16 '23 05:11 jzhou77

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: b9584eedd80c023f4281fae521ef2135e555b35a
  • Duration 0:07:23
  • Result: :x: FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Nov 16 '23 06:11 foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: b9584eedd80c023f4281fae521ef2135e555b35a
  • Duration 0:07:47
  • Result: :x: FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Nov 16 '23 06:11 foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: b9584eedd80c023f4281fae521ef2135e555b35a
  • Duration 0:07:50
  • Result: :x: FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

Nov 16 '23 06:11 foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: b9584eedd80c023f4281fae521ef2135e555b35a
  • Duration 0:31:24
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Nov 16 '23 06:11 foundationdb-ci

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: b9584eedd80c023f4281fae521ef2135e555b35a
  • Duration 0:46:28
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Nov 16 '23 06:11 foundationdb-ci