[Enhancement] merge fragment-profile just in time

Open murphyatwork opened this issue 1 year ago • 5 comments

Why I'm doing:

Currently, the query profile is built hierarchically:

  • Each backend reports its profiles at the fragment-instance level
  • FE keeps all of these profiles in memory until the query finishes
  • As a result, the more fragments and backends a query involves, the more memory its running profile takes
  • On a large cluster, the query profile can easily take hundreds of MB of FE memory

The tradeoff:

  1. If we want to provide a running profile, we have to keep every instance profile for the running instances
  2. If we don't need it, we only need to keep the fragment-level profiles and consolidate them into a query profile when the query finishes

What I'm doing:

Merge each instance-level profile into the fragment-level profile as soon as the instance finishes:

  1. Correctness relies on the fact that an instance's counters never change after it finishes
  2. It can greatly reduce memory consumption on a large cluster
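
A minimal sketch of the idea, not the code in this PR: `FragmentProfileSketch` and its methods are hypothetical names, and counters are modeled as plain `long` values that are summed. Real profile counters may need other merge rules (min, max, average), so treat the summation as a simplifying assumption. The sketch keeps a full counter map only for running instances and folds a finished instance into the fragment-level totals once.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a fragment-level profile; not a StarRocks class.
final class FragmentProfileSketch {
    // Totals of all finished instances, merged once and never touched again per instance.
    private final Map<String, Long> mergedCounters = new HashMap<>();
    // Latest report of each instance that is still running.
    private final Map<Integer, Map<String, Long>> runningInstances = new HashMap<>();
    // Ids only, so a repeated final report is not counted twice.
    private final Set<Integer> finishedInstanceIds = new HashSet<>();

    synchronized void report(int instanceId, Map<String, Long> counters, boolean done) {
        if (finishedInstanceIds.contains(instanceId)) {
            return; // counters are assumed immutable after the instance finishes
        }
        if (!done) {
            runningInstances.put(instanceId, new HashMap<>(counters));
            return;
        }
        // Instance finished: drop its per-instance profile and fold it into the totals.
        runningInstances.remove(instanceId);
        finishedInstanceIds.add(instanceId);
        counters.forEach((name, value) -> mergedCounters.merge(name, value, Long::sum));
    }

    // Running view: finished totals plus the latest report of each running instance.
    synchronized Map<String, Long> snapshot() {
        Map<String, Long> result = new HashMap<>(mergedCounters);
        for (Map<String, Long> counters : runningInstances.values()) {
            counters.forEach((name, value) -> result.merge(name, value, Long::sum));
        }
        return result;
    }

    // Number of full instance profiles still held in memory.
    synchronized int retainedInstanceProfiles() {
        return runningInstances.size();
    }
}
```

In this sketch the memory held per fragment scales with the number of running instances rather than with the total number of instances, while `snapshot()` can still serve a running profile.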

Fixes #issue

What type of PR is this:

  • [ ] BugFix
  • [ ] Feature
  • [x] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [ ] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [x] 3.3
    • [ ] 3.2
    • [ ] 3.1
    • [ ] 3.0
    • [ ] 2.5

murphyatwork avatar Sep 12 '24 11:09 murphyatwork

Suggest adding an FE config so that, if anything goes wrong, we can set it to false.

before-Sunrise avatar Oct 18 '24 06:10 before-Sunrise

Quality Gate failed

Failed conditions
4.7% Duplication on New Code (required ≤ 3%)
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Oct 18 '24 11:10 sonarqubecloud[bot]

@cursor review

murphyatwork avatar Jul 27 '25 11:07 murphyatwork

@Mergifyio rebase

murphyatwork avatar Aug 01 '25 07:08 murphyatwork

rebase

✅ Branch has been successfully rebased

mergify[bot] avatar Aug 01 '25 07:08 mergify[bot]

@Mergifyio rebase

murphyatwork avatar Aug 06 '25 02:08 murphyatwork

rebase

✅ Branch has been successfully rebased

mergify[bot] avatar Aug 06 '25 02:08 mergify[bot]

Quality Gate failed

Failed conditions
5.1% Duplication on New Code (required ≤ 3%)
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

sonarqubecloud[bot] avatar Aug 06 '25 02:08 sonarqubecloud[bot]

[FE Incremental Coverage Report]

:x: fail : 227 / 298 (76.17%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| com/starrocks/qe/StmtExecutor.java | 0 | 24 | 00.00% | [1101, 1102, 1103, 1104, 1105, 1106, 1127, 1128, 1130, 1131, 1132, 1134, 1135, 1137, 1138, 1139, 1142, 1143, 1144, 1145, 1146, 1147, 1149, 1154] |
| com/starrocks/qe/scheduler/QueryRuntimeProfile.java | 6 | 18 | 33.33% | [327, 328, 329, 330, 331, 332, 447, 448, 454, 455, 456, 457] |
| com/starrocks/qe/scheduler/dag/FragmentInstanceExecState.java | 3 | 6 | 50.00% | [325, 326, 330] |
| com/starrocks/qe/DefaultCoordinator.java | 4 | 5 | 80.00% | [1123] |
| com/starrocks/common/profile/SummarizationCounter.java | 24 | 30 | 80.00% | [57, 58, 60, 61, 62, 63] |
| com/starrocks/common/profile/Counter.java | 63 | 72 | 87.50% | [80, 81, 88, 89, 92, 93, 111, 112, 217] |
| com/starrocks/common/util/RuntimeProfile.java | 126 | 142 | 88.73% | [144, 152, 156, 320, 356, 364, 365, 366, 372, 617, 618, 619, 620, 621, 622, 623] |
| com/starrocks/common/Config.java | 1 | 1 | 100.00% | [] |

github-actions[bot] avatar Aug 06 '25 05:08 github-actions[bot]

ignore merge 4.0.0-rc01

wangsimo0 avatar Aug 11 '25 09:08 wangsimo0

don't forget to cp to branch-4.0

wangsimo0 avatar Aug 11 '25 09:08 wangsimo0

@mergifyio rebase

murphyatwork avatar Nov 25 '25 09:11 murphyatwork

@cursor review

murphyatwork avatar Nov 25 '25 09:11 murphyatwork

rebase

❌ Base branch update has failed

Git reported the following error:

Rebasing (1/6)
Auto-merging fe/fe-core/src/main/java/com/starrocks/common/Config.java
CONFLICT (content): Merge conflict in fe/fe-core/src/main/java/com/starrocks/common/Config.java
Auto-merging fe/fe-core/src/main/java/com/starrocks/common/util/RuntimeProfile.java
Auto-merging fe/fe-core/src/main/java/com/starrocks/qe/DefaultCoordinator.java
Auto-merging fe/fe-core/src/main/java/com/starrocks/qe/StmtExecutor.java
Auto-merging fe/fe-core/src/main/java/com/starrocks/qe/scheduler/QueryRuntimeProfile.java
Auto-merging fe/fe-core/src/main/java/com/starrocks/qe/scheduler/dag/FragmentInstanceExecState.java
Auto-merging fe/fe-core/src/main/java/com/starrocks/sql/ExplainAnalyzer.java
error: could not apply f2f2b50f1c... merge instance profile into fragment profile to reduce memory consumption
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply f2f2b50f1c... merge instance profile into fragment profile to reduce memory consumption

mergify[bot] avatar Nov 25 '25 09:11 mergify[bot]

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] avatar Nov 25 '25 09:11 github-actions[bot]

[BE Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] avatar Nov 25 '25 09:11 github-actions[bot]

@cursor rebase main branch, resolve conflicts

murphyatwork avatar Dec 01 '25 10:12 murphyatwork