
[Umbrella] Optimize Cluster Failover: no longer automatically add the NoExecute taint

Open XiShanYongYe-Chang opened this issue 7 months ago • 21 comments

Background

I put forward a proposal to optimize Cluster Failover in #6274. Based on the design in that proposal, the work is broken down as follows:

  • Proposal @XiShanYongYe-Chang #6274

release-1.14

  • [ ] No longer automatically add the NoExecute taint, so users can use Failover more safely
    • [x] API Change
      • [x] Add the ClusterTaintPolicy API (Time dimension condition matching is not supported in the API) @XiShanYongYe-Chang #6319
      • [x] Add comments on field updates and policy deletion for the ClusterTaintPolicy API @XiShanYongYe-Chang #6390
    • [x] karmada-webhook
      • [x] Add the ClusterTaintPolicy validation @XiShanYongYe-Chang #6370
      • [x] Add the support-no-execute flag @XiShanYongYe-Chang #6370
      • [x] Deprecate the default-not-ready-toleration-seconds and default-unreachable-toleration-seconds flags @XiShanYongYe-Chang #6373
    • [x] karmada-controller-manager
      • [x] Add the ClusterTaintPolicy controller to handle taints on the clusters (Time-dimension condition matching is currently not considered) @XiShanYongYe-Chang #6368 #6399
      • [x] cluster-controller will no longer automatically add not-ready and unreachable NoExecute taints to clusters @tiansuo114 #6389
      • [x] cluster-controller will no longer automatically add terminating taint to the deleting cluster @XiShanYongYe-Chang #6403
      • [x] Deprecate the failover-eviction-timeout flag @RainbowMango #6405
    • [x] Feature control
      • [x] Adjust the control scope of the Failover FeatureGate @mszacillo #6400
      • [x] Add failover flags to control the behavior of NoExecute @XiShanYongYe-Chang #6404
    • [x] UT&E2E @XiShanYongYe-Chang #6412
    • [ ] Documentation @XiShanYongYe-Chang https://github.com/karmada-io/website/pull/843
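The core of the ClusterTaintPolicy controller work listed above is matching cluster conditions and adding or removing taints accordingly. Below is a minimal, self-contained Go sketch of that add/remove reconciliation; all type and field names here (Condition, MatchCondition, StatusValues, reconcileTaints) are simplified illustrations, not the real karmada API, which has more fields such as operators and target-cluster selectors.

```go
package main

import "fmt"

// Illustrative, simplified shapes; the real ClusterTaintPolicy API differs.
type Condition struct{ Type, Status string }

type Taint struct{ Key, Effect string }

// MatchCondition is satisfied when some cluster condition of ConditionType
// has a status listed in StatusValues.
type MatchCondition struct {
	ConditionType string
	StatusValues  []string
}

// conditionsMatch reports whether every rule is satisfied by the cluster's conditions.
func conditionsMatch(rules []MatchCondition, conds []Condition) bool {
	for _, r := range rules {
		ok := false
		for _, c := range conds {
			if c.Type != r.ConditionType {
				continue
			}
			for _, v := range r.StatusValues {
				if c.Status == v {
					ok = true
				}
			}
		}
		if !ok {
			return false
		}
	}
	return true
}

// reconcileTaints applies the add/remove semantics described in the proposal:
// when the addOn rules match, the policy's taints are ensured on the cluster;
// when the removeOn rules match, they are removed.
func reconcileTaints(current, policyTaints []Taint, addOn, removeOn []MatchCondition, conds []Condition) []Taint {
	has := func(ts []Taint, t Taint) bool {
		for _, x := range ts {
			if x == t {
				return true
			}
		}
		return false
	}
	out := append([]Taint(nil), current...)
	switch {
	case conditionsMatch(addOn, conds):
		for _, t := range policyTaints {
			if !has(out, t) {
				out = append(out, t)
			}
		}
	case conditionsMatch(removeOn, conds):
		filtered := out[:0]
		for _, t := range out {
			if !has(policyTaints, t) {
				filtered = append(filtered, t)
			}
		}
		out = filtered
	}
	return out
}

func main() {
	policy := []Taint{{Key: "not-ready", Effect: "NoSchedule"}}
	addOn := []MatchCondition{{ConditionType: "Ready", StatusValues: []string{"False", "Unknown"}}}
	removeOn := []MatchCondition{{ConditionType: "Ready", StatusValues: []string{"True"}}}
	notReady := []Condition{{Type: "Ready", Status: "False"}}
	fmt.Println(reconcileTaints(nil, policy, addOn, removeOn, notReady)) // taint added
}
```

Note that with NoSchedule as the effect, a matched policy keeps new workloads off the cluster without evicting existing ones, which is the safer default this umbrella issue aims for.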

XiShanYongYe-Chang avatar Apr 23 '25 03:04 XiShanYongYe-Chang

Please organize the tasks by releases; we need the scope for v1.14.

RainbowMango avatar Apr 28 '25 09:04 RainbowMango

Please organize the tasks by releases; we need the scope for v1.14.

The target release is now noted after each subtask.

XiShanYongYe-Chang avatar Apr 28 '25 11:04 XiShanYongYe-Chang

Hello, I'm very interested in the following two issues; could they be assigned to me?

  • The system will no longer automatically add taints to clusters
  • Default ClusterTaintPolicy: Adds NoSchedule taint to clusters

I also have some questions about the second issue:

Does “ClusterTaintPolicy” here refer specifically to the ClusterTaintPolicy controller, or to something else? If it refers to the controller, do we need to wait until that issue is completed before starting work on this one?

  • Add the ClusterTaintPolicy controller to handle taints on the clusters

If that’s the case, here’s my rough plan for tackling the first issue—I’d appreciate any feedback or suggestions:

  • It looks like pkg/controllers/cluster/cluster_controller.go is the core of our taint logic. In that file we define several taint templates (e.g., UnreachableTaintTemplate, NotReadyTaintTemplate) that syncCluster and related functions invoke.
  • To disable automatic tainting, we could remove (or guard) the call to taintClusterByCondition inside syncCluster. That should stop the controller from automatically managing taints on clusters.
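As a rough illustration of the guard proposed above, here is a self-contained Go sketch. The names (taintClusterByCondition, NotReadyTaintTemplate, the autoTaintEnabled flag) mirror this discussion but are stand-ins, not the actual karmada controller code.

```go
package main

import "fmt"

// Taint is a simplified stand-in for the cluster taint type.
type Taint struct{ Key, Effect string }

// Hypothetical stand-ins for the templates defined in
// pkg/controllers/cluster/cluster_controller.go; values are illustrative.
var (
	NotReadyTaintTemplate    = Taint{Key: "cluster.example.io/not-ready", Effect: "NoExecute"}
	UnreachableTaintTemplate = Taint{Key: "cluster.example.io/unreachable", Effect: "NoExecute"}
)

// taintClusterByCondition mimics the controller hook: it returns the taints
// that would be applied for the given readiness status.
func taintClusterByCondition(ready bool) []Taint {
	if ready {
		return nil
	}
	return []Taint{NotReadyTaintTemplate}
}

// syncCluster sketches the proposed guard: when automatic tainting is
// disabled, the call is skipped entirely rather than deleted from the code,
// so the behavior stays available behind a flag.
func syncCluster(autoTaintEnabled, ready bool) []Taint {
	if !autoTaintEnabled {
		return nil // proposal: no longer add NoExecute taints automatically
	}
	return taintClusterByCondition(ready)
}

func main() {
	fmt.Println(len(syncCluster(true, false)))  // guard off: taint applied
	fmt.Println(len(syncCluster(false, false))) // guard on: no taints
}
```

Guarding rather than removing the call keeps the change reversible, which matches how the flag-based control ended up being implemented in this umbrella.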

Thanks for considering my request—any guidance you can provide would be great!

tiansuo114 avatar May 10 '25 13:05 tiansuo114

The system will no longer automatically add taints to clusters

Thanks @tiansuo114!

Does “ClusterTaintPolicy” here refer specifically to the ClusterTaintPolicy controller, or to something else? If it refers to the controller, do we need to wait until that issue is completed before starting work on this one?

Yes, you are right. Let me add the ClusterTaintPolicy controller first.

XiShanYongYe-Chang avatar May 12 '25 01:05 XiShanYongYe-Chang

The system will no longer automatically add taints to clusters

Hi @tiansuo114, if you'd like, we can assign this task to you first.

XiShanYongYe-Chang avatar May 13 '25 12:05 XiShanYongYe-Chang

The system will no longer automatically add taints to clusters

Hi @tiansuo114, if you'd like, we can assign this task to you first.

OK, I'll take this task on, and I'm very glad we can work on it together.

tiansuo114 avatar May 13 '25 14:05 tiansuo114

Hi @XiShanYongYe-Chang, if you need any help I can pick something up as well!

Out of curiosity, will we be adding “Make changes to taint manager to build PreservedLabelState when triggering eviction” to the iteration tasks? This was postponed in the original stateful failover umbrella task #5788, but we would still be interested in supporting it, since state preservation should occur during cluster failover as well.

mszacillo avatar May 16 '25 20:05 mszacillo

Hi @mszacillo, thanks a lot! I'm very glad to have your help. The tasks planned for v1.14 may not be easy to pick up yet, though, as they depend on the API modifications.

Out of curiosity, will we be adding “Make changes to taint manager to build PreservedLabelState when triggering eviction” to the iteration tasks?

The current work does not include this yet. Once the Cluster Failover refactoring for v1.14 is mostly complete, that work will proceed more smoothly.

This was postponed in the original stateful failover umbrella task https://github.com/karmada-io/karmada/issues/5788, but we would still be interested in supporting it, since state preservation should occur during cluster failover as well.

This is part of my plan for version 1.15. I apologize for not communicating this with you in a timely manner. In fact, I have already written a Chinese design document. Currently, my main focus is on optimizing Cluster Failover. I will push this work forward as soon as possible at the beginning of next month. I look forward to your communication and feedback.

XiShanYongYe-Chang avatar May 17 '25 11:05 XiShanYongYe-Chang

@XiShanYongYe-Chang sounds great!

Perhaps I can pick up one of these?

  • Adjust the control scope of the Failover FeatureGate
  • When the same cluster exists in both the scheduling result and the gracefulEvictionTask at the same time, clean up the cluster in the latter

This is part of my plan for version 1.15. I apologize for not communicating this with you in a timely manner. In fact, I have already written a Chinese design document.

Understood. Thanks for the update! I'll be on the lookout for the 1.15 work, which will include stateful failover items.

mszacillo avatar May 20 '25 13:05 mszacillo

Thanks @mszacillo I'll assign these two tasks to you first.

XiShanYongYe-Chang avatar May 21 '25 01:05 XiShanYongYe-Chang

@XiShanYongYe-Chang Sounds great, thank you. I have some follow-up questions to make sure I understand the task requirements:

  1. GracefulEvictionTask clean issue

    • With the deprecation of the ClusterEviction plugin, what do we do in cases of application failover? In those cases, we should still filter the fromCluster out of the feasible clusters.
    • For the implementation, I plan to update the ClusterInGracefulEvictionTasks method to ignore clusters in the graceful eviction task whose reason is EvictionReasonTaintUntolerated. That way we still cover application failovers.
    • I'll add logic to the graceful_eviction_controller to clean up a task when the scheduler has updated rb.clusters with the same cluster.
  2. Adjust the control scope of the Failover FeatureGate

    • I'll be using the Failover FeatureGate in the PR for the GracefulEvictionTask clean issue. Is there another spot where the scope of the feature gate needs to be changed?
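The eviction-task filtering described in point 1 might look roughly like the following self-contained Go sketch; the GracefulEvictionTask type and the EvictionReasonTaintUntolerated constant are simplified stand-ins for the real karmada API types.

```go
package main

import "fmt"

// EvictionReasonTaintUntolerated follows the constant named in the
// discussion above; the real definition lives in the karmada API packages.
const EvictionReasonTaintUntolerated = "TaintUntolerated"

// GracefulEvictionTask is a simplified stand-in for a pending eviction entry.
type GracefulEvictionTask struct {
	FromCluster string
	Reason      string
}

// clusterInGracefulEvictionTasks reports whether target still counts as
// "being evicted": entries produced by taint-based eviction are ignored, so
// a cluster that reappears in the scheduling result can be cleaned up, while
// application-failover entries still keep the cluster filtered out.
// This is a sketch of the proposed change, not the actual karmada code.
func clusterInGracefulEvictionTasks(tasks []GracefulEvictionTask, target string) bool {
	for _, t := range tasks {
		if t.FromCluster == target && t.Reason != EvictionReasonTaintUntolerated {
			return true
		}
	}
	return false
}

func main() {
	tasks := []GracefulEvictionTask{
		{FromCluster: "member1", Reason: EvictionReasonTaintUntolerated},
		{FromCluster: "member2", Reason: "ApplicationFailure"},
	}
	fmt.Println(clusterInGracefulEvictionTasks(tasks, "member1")) // taint-based entry: ignored
	fmt.Println(clusterInGracefulEvictionTasks(tasks, "member2")) // non-taint entry: still blocked
}
```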

mszacillo avatar May 22 '25 03:05 mszacillo

With the deprecation of the ClusterEviction plugin, what do we do in cases of application failover? In those cases, we should still filter the fromCluster out of the feasible clusters.

Thanks @mszacillo, the point you raised is crucial. The conflict between cluster failover and application failover is not yet resolved, so the ClusterEviction plugin cannot simply be dropped.

I'll be using the Failover FeatureGate in the PR for the GracefulEvictionTask clean issue. Is there another spot where the scope of the feature gate needs to be changed?

I think the gracefulEvictionTask clearing work can be left out of the Failover gating, because application failover also uses this capability. We only need to control the taint processing, that is, the work of the taint-manager.

XiShanYongYe-Chang avatar May 22 '25 03:05 XiShanYongYe-Chang

The ClusterEviction plugin cannot simply be dropped.

Hi @mszacillo, I've thought about it again. To avoid affecting application failover, this subtask is suspended for now; I'll consider other solutions. Thank you for raising the question.

XiShanYongYe-Chang avatar May 22 '25 08:05 XiShanYongYe-Chang

I think the gracefulEvictionTask clearing work can be left out of the Failover gating, because application failover also uses this capability. We only need to control the taint processing, that is, the work of the taint-manager.

Sounds good. I have opened a PR that gates the TaintManager on the Failover feature gate and also deprecates the GracefulEviction feature gate. I tested locally and confirmed that taint-based eviction does not take place when the Failover gate is false; setting it to true allows taint-based eviction to work.
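A minimal sketch of that gating pattern, assuming a simple map-backed feature-gate registry (the real implementation uses Kubernetes' featuregate machinery from k8s.io/component-base, and the gate name "Failover" follows this discussion):

```go
package main

import "fmt"

// featureGates is a toy stand-in for a feature-gate registry.
type featureGates map[string]bool

// Enabled reports whether the named gate is on; unknown gates default to off.
func (f featureGates) Enabled(name string) bool { return f[name] }

// startControllers sketches the gating described above: the taint manager
// only starts when the Failover gate is enabled, so taint-based eviction is
// fully switched off otherwise. Controller names are illustrative.
func startControllers(gates featureGates) []string {
	started := []string{"cluster-controller"}
	if gates.Enabled("Failover") {
		started = append(started, "taint-manager")
	}
	return started
}

func main() {
	fmt.Println(startControllers(featureGates{"Failover": true}))
	fmt.Println(startControllers(featureGates{"Failover": false}))
}
```

Gating at controller startup (rather than inside each reconcile loop) keeps the off state cheap: no taint-manager watches or queues are even created when the gate is disabled.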

mszacillo avatar May 22 '25 16:05 mszacillo

@XiShanYongYe-Chang Given this issue was tagged with release-1.14, can you help to open another umbrella issue to track tasks planned in release-1.15? (Move tasks to that issue as well if they are planned in release-1.16+)

RainbowMango avatar Jun 04 '25 06:06 RainbowMango

@XiShanYongYe-Chang Given this issue was tagged with release-1.14, can you help to open another umbrella issue to track tasks planned in release-1.15? (Move tasks to that issue as well if they are planned in release-1.16+)

Good idea; the 1.14 work is almost finished. Let me do it.

XiShanYongYe-Chang avatar Jun 04 '25 06:06 XiShanYongYe-Chang

As @RainbowMango suggested, I've split the remaining v1.15/v1.16 tasks out to #6428. Let's continue the cluster failover iteration in that umbrella issue.

XiShanYongYe-Chang avatar Jun 04 '25 06:06 XiShanYongYe-Chang

/assign @RainbowMango @mszacillo @tiansuo114 @XiShanYongYe-Chang

XiShanYongYe-Chang avatar Jun 04 '25 06:06 XiShanYongYe-Chang

/kind feature

XiShanYongYe-Chang avatar Jun 04 '25 06:06 XiShanYongYe-Chang

As @RainbowMango suggested, I've split the remaining v1.15/v1.16 tasks out to #6242. Let's continue the cluster failover iteration in that umbrella issue.

Not 6242

RainbowMango avatar Jun 04 '25 07:06 RainbowMango

Not 6242

Updated, thanks for the reminder.

XiShanYongYe-Chang avatar Jun 04 '25 07:06 XiShanYongYe-Chang

/close

RainbowMango avatar Jun 24 '25 03:06 RainbowMango

@RainbowMango: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

karmada-bot avatar Jun 24 '25 03:06 karmada-bot