[Optimize](Random distribution) Improve the performance of tablet sink and delta writer of writing blocks

Open eldenmoon opened this issue 1 year ago • 12 comments

The current distribution model for Doris is as follows:

OlapTableSink separates the original Block into several sub-blocks, one per node (BE), according to the tablet distribution, and sends each sub-block to the storage engine of the corresponding backend. The storage engine then separates the sub-block across multiple tablet channels, and each delta writer handles part of the block.
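To make the per-row cost of this model concrete, here is a minimal sketch of splitting a block by destination tablet. It is an illustration only, not Doris code: `Block`, `tablet_for_row`, and `split_by_tablet` are hypothetical names, the block is reduced to a single integer column, and the modulo mapping merely stands in for whatever the real tablet distribution computes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a columnar block: a single int64 column.
struct Block {
    std::vector<int64_t> rows;
};

// Assumed row-to-tablet mapping (value modulo tablet count, kept non-negative).
inline int64_t tablet_for_row(int64_t value, int64_t num_tablets) {
    assert(num_tablets > 0);
    return ((value % num_tablets) + num_tablets) % num_tablets;
}

// Groups the rows of `block` into one sub-block per destination tablet.
// Every row is inspected and copied once; this is the splitting cost
// that the description above calls relatively heavy.
inline std::map<int64_t, Block> split_by_tablet(const Block& block,
                                                int64_t num_tablets) {
    std::map<int64_t, Block> per_tablet;
    for (int64_t value : block.rows) {
        per_tablet[tablet_for_row(value, num_tablets)].rows.push_back(value);
    }
    return per_tablet;
}
```

In this sketch the same kind of split would happen twice: once in the sink to group rows per backend, and again on the backend to group rows per tablet channel.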

This model causes blocks to be split according to tablets, and the splitting process can be a relatively heavy operation. After splitting, the blocks are distributed to different DeltaWriters (Memtables) through RPCs to TabletChannels. The distribution operation on TabletChannels is also a relatively heavy operation. If the distribution property of the table is RANDOM, we can instead distribute each block as a whole, without splitting it by row. The advantage of doing so is reduced memory copying and better write locality, similar to appending the entire block to the memtable.
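The whole-block idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation in this PR: `Block`, `DeltaWriter`, and `RandomBlockRouter` are hypothetical types, and round-robin tablet selection is an assumption chosen only to keep the example deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a columnar block: a single int64 column.
struct Block {
    std::vector<int64_t> rows;
};

// Hypothetical stand-in for a delta writer and its memtable.
struct DeltaWriter {
    std::vector<Block> appended;
    std::size_t row_count = 0;

    // Takes ownership of the whole block; no per-row copy is made.
    void append(Block&& block) {
        row_count += block.rows.size();
        appended.push_back(std::move(block));
    }
};

// Routes each incoming block, unsplit, to exactly one tablet.
// Round-robin selection is an assumption made for this sketch.
class RandomBlockRouter {
public:
    explicit RandomBlockRouter(std::size_t num_tablets) : writers_(num_tablets) {
        assert(num_tablets > 0);
    }

    // Moves the complete block into one writer and returns the tablet index used.
    std::size_t write(Block block) {
        const std::size_t idx = next_;
        next_ = (next_ + 1) % writers_.size();
        writers_[idx].append(std::move(block));
        return idx;
    }

    const DeltaWriter& writer(std::size_t i) const { return writers_.at(i); }

private:
    std::vector<DeltaWriter> writers_;
    std::size_t next_ = 0;
};
```

Because the block is moved rather than regrouped row by row, the work per block no longer depends on the number of rows, which is where the reduced copying and better locality would come from.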

This optimization could save 10% ~ 20% of the CPU cost of loading into a RANDOM distribution table when load_to_single_tablet is enabled. What's more, in single_replica_load mode we could even write to the local delta writer directly from OlapTableSink.

Proposed changes

Issue Number: close #17388

Problem summary

Describe your changes.

Checklist(Required)

  • [ ] Does it affect the original behavior
  • [ ] Has unit tests been added
  • [ ] Has document been added or modified
  • [ ] Does it need to update dependencies
  • [ ] Is this PR support rollback (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

eldenmoon avatar Mar 03 '23 07:03 eldenmoon

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Mar 03 '23 07:03 github-actions[bot]

run p0

eldenmoon avatar Mar 03 '23 08:03 eldenmoon

run buildall

eldenmoon avatar Mar 03 '23 13:03 eldenmoon

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Mar 03 '23 13:03 github-actions[bot]

TeamCity pipeline, clickbench performance test result:

  • the sum of best hot time: 33.53 seconds
  • stream load tsv: 460 seconds loaded 74807831229 Bytes, about 155 MB/s
  • stream load json: 40 seconds loaded 2358488459 Bytes, about 56 MB/s
  • stream load orc: 74 seconds loaded 1101869774 Bytes, about 14 MB/s
  • stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230306101319_clickbench_pr_109289.html

hello-stephen avatar Mar 03 '23 15:03 hello-stephen

run buildall

eldenmoon avatar Mar 03 '23 19:03 eldenmoon

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Mar 03 '23 19:03 github-actions[bot]

run buildall

eldenmoon avatar Mar 04 '23 01:03 eldenmoon

run buildall

eldenmoon avatar Mar 06 '23 03:03 eldenmoon

run buildall

eldenmoon avatar Mar 06 '23 03:03 eldenmoon

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Mar 06 '23 03:03 github-actions[bot]

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Mar 06 '23 05:03 github-actions[bot]

run buildall

eldenmoon avatar Mar 06 '23 06:03 eldenmoon

@caiconghui Sorry to disturb you, any suggestions on this PR?

eldenmoon avatar Mar 06 '23 06:03 eldenmoon

run buildall

eldenmoon avatar Mar 06 '23 09:03 eldenmoon

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Mar 06 '23 09:03 github-actions[bot]

run beut

eldenmoon avatar Mar 07 '23 02:03 eldenmoon

run feut

eldenmoon avatar Mar 07 '23 02:03 eldenmoon

PR approved by anyone and no changes requested.

github-actions[bot] avatar Mar 10 '23 02:03 github-actions[bot]