fix the issue where federatedresourcequota becomes invalid when the workload scales up
What type of PR is this? /kind bug
What this PR does / why we need it:
We introduced federatedresourcequota enforcement in v1.14 to achieve unified quota management across clusters. This is realized by intercepting modifications to workload requirements and the spec.clusters field of resource bindings. However, the federated resource quota mechanism fails when workloads scale up.
The root cause is that the binding controller synchronizes workload modifications to member clusters. When the replicas scheduling policy is set to "duplicated", the binding controller does not revise the workload's replicas according to the scheduling results in bindingSpec.clusters. As a result, the scale-up operations of workloads are directly synchronized to member clusters, rendering the federated resource quota ineffective.
The detailed process is as follows:
- Edit the replicas of a workload.
- The binding controller synchronizes the workload's replicas to member clusters.
- Karmada-scheduler performs rescheduling, and the request to update
bindingSpec.clustersis intercepted due to insufficient quota.
As described above, even though the update request for bindingSpec.clusters is blocked, the replicas of the workload in member clusters are still synchronized, leading to the failure of the federated resource quota.
This PR modifies the logic for the binding controller to sync workloads to member clusters. It syncs the replicas in bindingSpec.clusters to member clusters instead of the replicas of the workload.
Which issue(s) this PR fixes: Parts of #6467
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
`karmada-controller-manager`: fix the issue where federatedresourcequota becomes invalid when the workload scales up
:warning: Please install the to ensure uploads and comments are reliably processed by Codecov.
Codecov Report
Attention: Patch coverage is 30.00000% with 14 lines in your changes missing coverage. Please review.
Project coverage is 48.98%. Comparing base (
571801f) to head (7b9bf2f). Report is 15 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/controllers/binding/common.go | 30.00% | 13 Missing and 1 partial :warning: |
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## master #6474 +/- ##
==========================================
- Coverage 49.03% 48.98% -0.06%
==========================================
Files 687 687
Lines 56038 56097 +59
==========================================
Hits 27478 27478
- Misses 26779 26839 +60
+ Partials 1781 1780 -1
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 48.98% <30.00%> (-0.06%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
@mszacillo Do you think it's necessary to backport this PR to release-1.14?
@mszacillo Do you think it's necessary to backport this PR to
release-1.14?
Yes I think so. I currently rebased our internal branch against the latest 1.14 release, so having this fix in that release would be helpful.
/retitle Fix workload scale up bypass FederatedResourceQuota check issue
@mszacillo Do you think it's necessary to backport this PR to
release-1.14?Yes I think so. I currently rebased our internal branch against the latest 1.14 release, so having this fix in that release would be helpful.
I don't mind backporting this.
But I don't think this issue will affect @mszacillo's scenario, as it only applies to workloads using Duplicated mode.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mszacillo, RainbowMango
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~pkg/controllers/OWNERS~~ [RainbowMango,mszacillo]
- ~~test/OWNERS~~ [RainbowMango]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment