[Enhancement] reduce read io requests during spill restore phase

Open silverbullet233 opened this issue 1 year ago • 3 comments

Why I'm doing:

Currently, the spill restore phase reads chunks one at a time from a Block. BlockReader has no cache, so every read triggers an IO request, which is very inefficient on remote storage.

What I'm doing:

Add buffered reads to BlockReader to reduce the number of IO requests.

Main changes:

  1. Introduce two session variables, enable_spill_buffer_read and max_spill_read_buffer_bytes_per_driver. enable_spill_buffer_read controls whether buffered reads are enabled, and max_spill_read_buffer_bytes_per_driver controls how much data a single operator may buffer.
  2. Merge the implementations of File/LogBlockReader::read_fully into the base class to remove redundant code, and build buffered reads on top of the shared implementation.
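The effect of buffering can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `CountingSource`, `BufferedReader`, and `restore_all` are hypothetical names invented for the example, and the buffer size simply plays the role that max_spill_read_buffer_bytes_per_driver plays in the description above. The reader serves small chunk reads from an in-memory buffer and only issues a request to the underlying source when the buffer is drained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for remote storage; counts every read request it serves.
class CountingSource {
public:
    explicit CountingSource(std::string data) : _data(std::move(data)) {}

    // Copies up to `len` bytes starting at `offset` into `out`; returns bytes copied.
    size_t read_at(size_t offset, char* out, size_t len) {
        ++_io_requests;
        if (offset >= _data.size()) return 0;
        size_t n = std::min(len, _data.size() - offset);
        std::memcpy(out, _data.data() + offset, n);
        return n;
    }

    size_t io_requests() const { return _io_requests; }

private:
    std::string _data;
    size_t _io_requests = 0;
};

// Hypothetical buffered reader (an illustration, not the StarRocks BlockReader).
// `buffer_bytes` must be greater than zero.
class BufferedReader {
public:
    BufferedReader(CountingSource* source, size_t buffer_bytes)
            : _source(source), _buffer(buffer_bytes) {}

    // Reads exactly `len` bytes into `out`; returns false if the source ends first.
    bool read_fully(char* out, size_t len) {
        while (len > 0) {
            if (_pos == _filled) {
                // Buffer drained: refill it with a single, larger request.
                _filled = _source->read_at(_offset, _buffer.data(), _buffer.size());
                _offset += _filled;
                _pos = 0;
                if (_filled == 0) return false;
            }
            size_t n = std::min(len, _filled - _pos);
            std::memcpy(out, _buffer.data() + _pos, n);
            _pos += n;
            out += n;
            len -= n;
        }
        return true;
    }

private:
    CountingSource* _source;
    std::vector<char> _buffer;
    size_t _offset = 0;  // next source offset to fetch
    size_t _filled = 0;  // number of valid bytes in _buffer
    size_t _pos = 0;     // next unread byte in _buffer
};

// Reads `total_bytes` in chunks of `chunk_bytes` and returns the request count.
size_t restore_all(size_t total_bytes, size_t chunk_bytes, size_t buffer_bytes) {
    CountingSource source(std::string(total_bytes, 'x'));
    BufferedReader reader(&source, buffer_bytes);
    std::vector<char> chunk(chunk_bytes);
    for (size_t done = 0; done + chunk_bytes <= total_bytes; done += chunk_bytes) {
        if (!reader.read_fully(chunk.data(), chunk_bytes)) break;
    }
    return source.io_requests();
}
```

In this sketch, `restore_all(1000, 10, 10)` behaves like an unbuffered reader and issues one request per chunk, while `restore_all(1000, 10, 100)` issues one request per 100 bytes. A larger buffer trades memory per operator for fewer round trips, which is why the size is capped by a session variable.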

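Test result

I tested several of the more complex queries in tpcds-1t under force spill mode. With all data spilled to OSS, the number of IO requests dropped by 82.60% to 99.74%, and query time fell substantially (speedup between 114.35% and 590.39%). With all data spilled to local disk, the effect on query time was small (speedup between 98.46% and 115.79%).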

| Query | Time (ms), all spill to OSS: enable buffer read | Time (ms), all spill to OSS: disable buffer read | Speedup (OSS) | Time (ms), all spill to local disk: enable buffer read | Time (ms), all spill to local disk: disable buffer read | Speedup (local disk) | IO requests: enable buffer read | IO requests: disable buffer read | Reduce ratio |
|---|---|---|---|---|---|---|---|---|---|
| QUERY04 | 190305 | 217614 | 114.35% | 192133 | 208756 | 108.65% | 4649 | 100422 | 95.37% |
| QUERY11 | 122376 | 142772 | 116.67% | 110633 | 117879 | 106.55% | 3903 | 80666 | 95.16% |
| QUERY23-1 | 267710 | 600509 | 224.31% | 239933 | 260251 | 108.47% | 40876 | 673530 | 93.93% |
| QUERY23-2 | 267012 | 566536 | 212.18% | 238909 | 262279 | 109.78% | 41406 | 690290 | 94.00% |
| QUERY51 | 34372 | 115126 | 334.94% | 21126 | 22831 | 108.07% | 2177 | 290784 | 99.25% |
| QUERY64 | 99432 | 493314 | 496.13% | 91407 | 93526 | 102.32% | 9784 | 709848 | 98.62% |
| QUERY65 | 59704 | 208039 | 348.45% | 37864 | 40907 | 108.04% | 4570 | 416542 | 98.90% |
| QUERY67 | 322700 | 490388 | 151.96% | 280088 | 313212 | 111.83% | 53923 | 309864 | 82.60% |
| QUERY72 | 35089 | 53000 | 151.04% | 22054 | 25537 | 115.79% | 256 | 50404 | 99.49% |
| QUERY75 | 66099 | 152452 | 230.64% | 55740 | 55817 | 100.14% | 2958 | 209492 | 98.59% |
| QUERY97 | 43632 | 257600 | 590.39% | 25714 | 25319 | 98.46% | 1664 | 651470 | 99.74% |
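The PR does not spell out how the derived columns are computed. Judging from the numbers, speedup appears to be the time with buffer read disabled divided by the time with it enabled, and the reduce ratio appears to be one minus the ratio of enabled to disabled IO requests. The following sketch encodes that assumption; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Assumed definition: time without buffer read divided by time with it, in percent.
// A value below 100 means the query got slower with buffer read enabled.
double speedup_percent(double enabled_ms, double disabled_ms) {
    return disabled_ms / enabled_ms * 100.0;
}

// Assumed definition: share of IO requests eliminated by buffer read, in percent.
double io_reduce_percent(double enabled_requests, double disabled_requests) {
    return (1.0 - enabled_requests / disabled_requests) * 100.0;
}
```

For QUERY04 spilled to OSS, `speedup_percent(190305, 217614)` and `io_reduce_percent(4649, 100422)` agree with the 114.35% and 95.37% reported in the table, to two decimal places.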

What type of PR is this:

  • [ ] BugFix
  • [ ] Feature
  • [x] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [ ] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [x] 3.3
    • [ ] 3.2
    • [ ] 3.1
    • [ ] 3.0
    • [ ] 2.5

silverbullet233 · Apr 29 '24 07:04

Quality Gate failed

Failed conditions
B Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud


sonarqubecloud[bot] · Jun 04 '24 08:06

[FE Incremental Coverage Report]

:white_check_mark: pass : 4 / 4 (100.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: com/starrocks/qe/SessionVariable.java | 4 | 4 | 100.00% | [] |

github-actions[bot] · Jun 04 '24 10:06

[BE Incremental Coverage Report]

:white_check_mark: pass : 83 / 91 (91.21%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: be/src/exec/spill/block_reader.cpp | 39 | 47 | 82.98% | [67, 68, 69, 74, 75, 76, 77, 78] |
| :large_blue_circle: be/src/exec/spill/input_stream.cpp | 16 | 16 | 100.00% | [] |
| :large_blue_circle: be/src/exec/spill/file_block_manager.cpp | 6 | 6 | 100.00% | [] |
| :large_blue_circle: be/src/exec/pipeline/hashjoin/spillable_hash_join_probe_operator.cpp | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/exec/pipeline/sort/spillable_partition_sort_sink_operator.cpp | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/exec/pipeline/aggregate/spillable_aggregate_blocking_sink_operator.cpp | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/exec/pipeline/hashjoin/spillable_hash_join_build_operator.cpp | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/exec/spill/block_manager.h | 1 | 1 | 100.00% | [] |
| :large_blue_circle: be/src/runtime/runtime_state.h | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/exec/spill/log_block_manager.cpp | 5 | 5 | 100.00% | [] |
| :large_blue_circle: be/src/exec/pipeline/aggregate/spillable_aggregate_distinct_blocking_operator.cpp | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/exec/spill/serde.h | 1 | 1 | 100.00% | [] |
| :large_blue_circle: be/src/exec/spill/serde.cpp | 1 | 1 | 100.00% | [] |
| :large_blue_circle: be/src/exec/pipeline/nljoin/spillable_nljoin_build_operator.cpp | 2 | 2 | 100.00% | [] |

github-actions[bot] · Jun 04 '24 10:06

@Mergifyio backport branch-3.3

github-actions[bot] · Jun 05 '24 13:06

backport branch-3.3

✅ Backports have been created

mergify[bot] · Jun 05 '24 13:06