tidb icon indicating copy to clipboard operation
tidb copied to clipboard

store/copr, kv, planner, executor: support dispatch tasks to tiflash ReadNode

Open guo-shaoge opened this issue 2 years ago • 18 comments

Signed-off-by: guo-shaoge [email protected]

What problem does this PR solve?

Issue Number: close #34707

Problem Summary: Support dispatch copTask/batchCopTask/mppTask to tiflash_mpp/tiflash ReadNodes.

NOTE: This PR will not be merged into master, because it change senmatics of tidb_isolation_read_engines. Will open feature branch. But you are free to review this.

What is changed and how it works?

For now, there are three types of Task that will be sent to TiFlash. We need to consider all three code paths in two dimension:

  1. Dispatch: make sure task will only be dispatched to tiflash_mpp/ReadNode using consistent hashing.
  2. ErrorHandling: make sure tiflash_mpp store addr cache is updated when error occurs.

Function changes for each task type of each dimension is as following:

  1. copTask: 1.1. Dispatch: choose tiflash_mpp store when get grpcCtx(in client-go). 1.2. ErrorHandling: onSendFail() in client-go.
  2. batchCopTask 2.1. Dispatch: (*CopClient) sendBatch(). Add new storeType parameter. 2.2. ErrorHandling: (ss *RegionBatchRequestSender) onSendFailForBatchRegions
  3. mppTask 3.1. Dispatch: (c *MPPClient) ConstructMPPTasks. 3.2. ErrorHandling: dispatch mppTask failed and establish mppTask failed.

A new storeType kv.TiFlashMPP is added to help distinguish whether task will only be dispatch to ReadNodes or not.

Check List

Tests

  • [ ] Unit test
  • [x] Integration test: 1. test-infra E2E case is added: https://github.com/pingcap/endless/pull/657 2. sqllogic test: https://github.com/pingcap/schrodinger-test/pull/484
  • [ ] Manual test (add detailed scripts or steps below)
  • [ ] No code

Side effects

  • [ ] Performance regression: Consumes more CPU
  • [ ] Performance regression: Consumes more Memory
  • [ ] Breaking backward compatibility

Documentation

  • [x] Affects user behaviors: tiflash_mpp is permited for tidb_isolation_read_engines.
  • [ ] Contains syntax changes
  • [ ] Contains variable changes
  • [ ] Contains experimental features
  • [ ] Changes MySQL compatibility

Release note

Support dispatch tasks to tiflash ReadNode.

guo-shaoge avatar Mar 29 '22 03:03 guo-shaoge

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • XuHuaiyu
  • wshwsh12

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment. After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review. Reviewer can cancel approval by submitting a request changes review.

ti-chi-bot avatar Mar 29 '22 03:03 ti-chi-bot

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar Mar 29 '22 03:03 sre-bot

Code Coverage Details: https://codecov.io/github/pingcap/tidb/commit/186a7a8c04f76f2b02522c1994bc554247ac3352

sre-bot avatar Mar 29 '22 03:03 sre-bot

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar Mar 29 '22 06:03 sre-bot

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar Apr 27 '22 13:04 sre-bot

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar Apr 27 '22 13:04 sre-bot

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar Apr 28 '22 13:04 sre-bot

/hold

guo-shaoge avatar Apr 28 '22 13:04 guo-shaoge

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar May 16 '22 13:05 sre-bot

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar May 16 '22 13:05 sre-bot

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar May 17 '22 04:05 sre-bot

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar May 18 '22 03:05 sre-bot

This pr need a cluster to test, I will write a test-infra case.

guo-shaoge avatar May 18 '22 04:05 guo-shaoge

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what is changed

Or if the count of mainly changed packages are more than 3, use

  • *: what is changed

After you have format title, you can leave a comment /run-check_title to recheck it

sre-bot avatar May 18 '22 04:05 sre-bot

/cc @XuHuaiyu

guo-shaoge avatar May 18 '22 04:05 guo-shaoge

/run-all-tests

guo-shaoge avatar May 23 '22 05:05 guo-shaoge

When to reload tiflashMPPStore if there is no grpc error? How to scale-out?

It's best to let etcd notify tidb to refresh store cache. But we can't arbitrarily add information to etcd. So we refresh cache every 30 seconds. It's only a temporary solution for now.

guo-shaoge avatar May 25 '22 07:05 guo-shaoge

/build

guo-shaoge avatar Jun 23 '22 08:06 guo-shaoge

/merge

guo-shaoge avatar Dec 08 '22 16:12 guo-shaoge

This pull request has been accepted and is ready to merge.

Commit hash: a50994d545a643ed9ff1eec31de06ff500688729

ti-chi-bot avatar Dec 08 '22 16:12 ti-chi-bot

TiDB MergeCI notify

🔴 Bad News! New failing [1] after this pr merged. These new failed integration tests seem to be caused by the current PR, please try to fix these new failed integration tests, thanks!

CI Name Result Duration Compare with Parent commit
idc-jenkins-ci/integration-cdc-test 🟥 failed 1, success 39, total 40 22 min New failing
idc-jenkins-ci-tidb/mybatis-test 🔴 failed 1, success 0, total 1 4 min 9 sec Existing failure
idc-jenkins-ci-tidb/integration-common-test 🟢 all 17 tests passed 12 min Existing passed
idc-jenkins-ci-tidb/common-test 🟢 all 11 tests passed 10 min Existing passed
idc-jenkins-ci-tidb/sqllogic-test-2 🟢 all 28 tests passed 6 min 41 sec Existing passed
idc-jenkins-ci-tidb/tics-test 🟢 all 1 tests passed 6 min 29 sec Existing passed
idc-jenkins-ci-tidb/integration-ddl-test 🟢 all 6 tests passed 6 min 4 sec Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1 🟢 all 26 tests passed 5 min 56 sec Existing passed
idc-jenkins-ci-tidb/integration-compatibility-test 🟢 all 1 tests passed 3 min 20 sec Existing passed
idc-jenkins-ci-tidb/plugin-test 🟢 build success, plugin test success 4min Existing passed

sre-bot avatar Dec 08 '22 17:12 sre-bot