Optimization: Optimistically tag and serialize mutations pre-resolution.

Open sfc-gh-dadkins opened this issue 3 years ago • 3 comments

Tagging and serializing mutations currently happens post-resolution, after a commit version has been assigned. This can result in head-of-line blocking by large transactions, as they must complete this CPU-intensive process prior to evaluating any later transactions.

This optimization attempts to tag and serialize mutations pre-resolution, prior to getting a commit version, on the assumption that the key->tag mapping changes infrequently.

We add a preResolution tag to assignMutationsToStorageServers so that the function can be called before resolution. If any metadata update has taken place between the time we tagged the mutations and the time we would have tagged them post-resolution, we throw away the tagged and serialized mutations and regenerate them.
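As a rough illustration of the optimistic path described above, the following is a minimal sketch and not the actual commit proxy code. All names in it (KeyTagMap, generation, preTag, finalize, and the toy serialization) are hypothetical stand-ins; in particular, detecting metadata updates through a generation counter is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not FoundationDB's real types.
using Tag = int;

struct Mutation {
    std::string key;
    std::string value;
};

struct TaggedMutation {
    Tag tag;
    std::string bytes; // toy stand-in for the serialized mutation
};

// Hypothetical key->tag mapping. Every metadata update bumps `generation`.
struct KeyTagMap {
    std::map<std::string, Tag> rangeStart; // range begin key -> tag
    uint64_t generation = 0;

    void move(const std::string& begin, Tag tag) {
        rangeStart[begin] = tag;
        ++generation;
    }

    Tag tagFor(const std::string& key) const {
        auto it = rangeStart.upper_bound(key);
        // The sketch assumes a range starting at "" exists, so every key is covered.
        assert(it != rangeStart.begin());
        return std::prev(it)->second;
    }
};

// The CPU-intensive step: tag and serialize every mutation.
std::vector<TaggedMutation> tagAndSerialize(const KeyTagMap& m, const std::vector<Mutation>& muts) {
    std::vector<TaggedMutation> out;
    out.reserve(muts.size());
    for (const auto& mu : muts) {
        out.push_back({ m.tagFor(mu.key), mu.key + "=" + mu.value });
    }
    return out;
}

struct PreTagged {
    uint64_t generation; // mapping generation observed when pre-tagging
    std::vector<TaggedMutation> out;
};

// Pre-resolution: do the work optimistically and remember the generation.
PreTagged preTag(const KeyTagMap& m, const std::vector<Mutation>& muts) {
    return { m.generation, tagAndSerialize(m, muts) };
}

// Post-resolution: reuse the optimistic result only if no metadata update
// happened in between; otherwise throw it away and regenerate.
std::vector<TaggedMutation> finalize(const KeyTagMap& m,
                                     const std::vector<Mutation>& muts,
                                     PreTagged pre,
                                     bool& reused) {
    reused = (pre.generation == m.generation);
    if (reused) {
        return std::move(pre.out);
    }
    return tagAndSerialize(m, muts);
}
```

In this sketch the expensive work is wasted only when the mapping changed between preTag and finalize, which matches the assumption that the key->tag mapping changes infrequently.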

The following conditions currently invalidate this optimization:

  • any metadata mutation, whether it went through this commit proxy, or we found out about it from the resolver

  • any transaction in the batch doesn't commit; we have no way to disentangle individual transactions' mutations once serialized

  • resolver private mutations are enabled; these will assert that they are the first mutations added to toCommit, which is not possible if we've already pre-tagged the mutations

  • encryption is enabled; cipher keys aren't available until getResolution so we can't serialize anything beforehand

  • any versionstamp correction; these mutations are modified prior to sending them to the resolver in such a way that they no longer look like metadata updates afterwards
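The conditions listed above can be read as a single predicate that is evaluated after resolution. The following is a hedged sketch of that reading only. The struct BatchState, its field names, and canReusePreTagged are hypothetical and do not correspond to identifiers in the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical summary of what the commit proxy knows about a batch after
// resolution. Field names are illustrative, not taken from the PR.
struct BatchState {
    bool sawMetadataMutation = false;             // via this proxy or reported by the resolver
    std::vector<bool> committed;                  // one entry per transaction in the batch
    bool resolverPrivateMutationsEnabled = false; // private mutations must be first in toCommit
    bool encryptionEnabled = false;               // cipher keys are not available before resolution
    bool versionstampCorrected = false;           // mutations were rewritten before resolution
};

// Returns true only if none of the invalidating conditions holds, i.e. the
// pre-tagged and serialized mutations may be used as they are.
bool canReusePreTagged(const BatchState& s) {
    if (s.sawMetadataMutation || s.resolverPrivateMutationsEnabled ||
        s.encryptionEnabled || s.versionstampCorrected) {
        return false;
    }
    // Serialized mutations cannot be split per transaction, so a single
    // transaction that did not commit invalidates the whole batch.
    return std::all_of(s.committed.begin(), s.committed.end(),
                       [](bool c) { return c; });
}
```

When the predicate is false, the fallback is the existing post-resolution path: discard the pre-tagged output and tag and serialize again.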

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • [ ] The PR has a description, explaining both the problem and the solution.
  • [ ] The description mentions which forms of testing were done and the testing seems reasonable.
  • [ ] Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • [ ] This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • [ ] There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

sfc-gh-dadkins, Aug 24 '22 17:08

Doxense CI Report for Windows 10

  • Commit ID: efae771092c944b8808bba67aad29ba3dd78d5d2
  • Result: :heavy_check_mark: SUCCEEDED
  • Build Logs (available for 30 days)

fdb-windows-ci, Aug 25 '22 22:08

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: efae771092c944b8808bba67aad29ba3dd78d5d2
  • Duration 1:06:19
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

foundationdb-ci, Aug 25 '22 23:08

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: efae771092c944b8808bba67aad29ba3dd78d5d2
  • Duration 3:40:00
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

foundationdb-ci, Aug 26 '22 01:08