[Enhancement] support schedule node by label
Why I'm doing: In shared nothing mode , the previous PR #38833 achieve the data replica placement strategy of backends with different locations. When execute query, want to limit to use the resource of some backends/computeNodes with specific labels , then can get resource hard isolation at machine level.
What I'm doing:
1.set property for user
add a new user property(labels.location) and support set property for user like:
SET PROPERTY for 'test' 'labels.location' = 'rack:group_a,rack:group_b';
(to be noticed: support multi level location property like a:b or a:b,c:d)
2.set label for node consider deploy backend only or deploy both backend and computeNode , set label for backend and computeNode. 2.1 for backend refer to the previous PR #38833
ALTER SYSTEM MODIFY BACKEND "IP1:9050" SET ("labels.location" = "rack:group_a");
ALTER SYSTEM MODIFY BACKEND "IP2:9050" SET ("labels.location" = "rack:group_b");
ALTER SYSTEM MODIFY BACKEND "IP3:9050" SET ("labels.location" = "rack:group_c");
2.2 for computeNode modify syntax to support add location property to computeNode like:
ALTER SYSTEM ADD COMPUTE NODE "IP4:9050" PROPERTIES ("labels.location" = "rack:group_a");
ALTER SYSTEM ADD COMPUTE NODE "IP5:9050" PROPERTIES ("labels.location" = "rack:group_b");
ALTER SYSTEM ADD COMPUTE NODE "IP6:9050" PROPERTIES ("labels.location" = "rack:group_c");
(to be noticed: only support single level location property like a:b now)
3.node selection when query 3.1 when deploy backend only the backend IP1+IP2 would be seleted 3.2 when deploy both backend and computeNode the backend IP1+IP2 and computeNode IP4+IP5 would be seleted
Fixes #issue
What type of PR is this:
- [ ] BugFix
- [ ] Feature
- [x] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [x] Yes, this PR will result in a change in behavior.
- [ ] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [x] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [x] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [x] I have checked the version labels which the pr will be auto-backported to the target branch
- [x] 3.3
- [ ] 3.2
- [ ] 3.1
- [ ] 3.0
- [ ] 2.5
Quality Gate failed
Failed conditions
C Reliability Rating on New Code (required ≥ A)
See analysis details on SonarCloud
Catch issues before they fail your Quality Gate with our IDE extension
SonarLint
[FE Incremental Coverage Report]
:x: fail : 95 / 125 (76.00%)
file detail
| path | covered_line | new_line | coverage | not_covered_line_detail | |
|---|---|---|---|---|---|
| :large_blue_circle: | com/starrocks/sql/ast/ShowUserPropertyStmt.java | 0 | 1 | 00.00% | [82] |
| :large_blue_circle: | com/starrocks/alter/SystemHandler.java | 0 | 5 | 00.00% | [297, 298, 299, 301, 302] |
| :large_blue_circle: | com/starrocks/system/SystemInfoService.java | 20 | 38 | 52.63% | [955, 956, 958, 959, 960, 961, 963, 964, 965, 985, 986, 988, 989, 990, 991, 993, 994, 995] |
| :large_blue_circle: | com/starrocks/authentication/AuthenticationMgr.java | 3 | 4 | 75.00% | [196] |
| :large_blue_circle: | com/starrocks/authentication/UserProperty.java | 16 | 19 | 84.21% | [156, 157, 336] |
| :large_blue_circle: | com/starrocks/qe/scheduler/DefaultWorkerProvider.java | 13 | 14 | 92.86% | [403] |
| :large_blue_circle: | com/starrocks/sql/analyzer/AlterSystemStmtAnalyzer.java | 18 | 19 | 94.74% | [99] |
| :large_blue_circle: | com/starrocks/sql/ast/AddComputeNodeClause.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/ast/ComputeNodeClause.java | 6 | 6 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/system/ComputeNode.java | 11 | 11 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/common/proc/ComputeNodeProcDir.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/qe/ConnectContext.java | 5 | 5 | 100.00% | [] |
[BE Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
@kevincai Hi Could you please review the pr when you have free time ? related UT and SQL-Tester have passed.
@alvin-celerdata Hi Could you please help review the PR when have free time? The PR achieve the calculate resource limitation of backends/computeNodes with specific labels at machine node level when run query, and associate with user property. I think it would be a better hard resource isolation in multi-tenancy environment in shared-nothing mode.
@alvin-celerdata Hi Could you please help review the PR when have free time? The PR achieve the calculate resource limitation of backends/computeNodes with specific labels at machine node level when run query, and associate with user property. I think it would be a better hard resource isolation in multi-tenancy environment in shared-nothing mode.
Before I deep dive into the code changes, I want to know what you are doing in this change. Could you create a GitHub feature request Issue ticket to describe it?
@alvin-celerdata Hi Could you please help review the PR when have free time? The PR achieve the calculate resource limitation of backends/computeNodes with specific labels at machine node level when run query, and associate with user property. I think it would be a better hard resource isolation in multi-tenancy environment in shared-nothing mode.
Before I deep dive into the code changes, I want to know what you are doing in this change. Could you create a GitHub feature request Issue ticket to describe it?
@alvin-celerdata Please look the PR description, it is clear. To be brief, in one cluster of shard-nothing mode, diffrent user could use exclusive resource(storage/computation) by label of node. Achieving a better hard resource isolation in multi-tenancy environment. like multi-warehourse in shard-data mode.