cluster-api
CAPI controller should taint outdated nodes with PreferNoSchedule
User Story
As an operator of Cluster API, when I make changes to an existing MachineDeployment or MachineSet infrastructure template, any existing nodes previously managed by it are reconciled, drained, and then replaced by the Cluster API controllers.
If the MachineSet or MachineDeployment has many replicas, and each node runs many pods, this can result in unnecessary pod churn. As the first node is drained, pods previously running on that node may be scheduled onto nodes that have yet to be replaced but will soon be torn down. When the Cluster API controller eventually drains those nodes, those pods are evicted and rescheduled yet again. In sufficiently large clusters, this can result in workloads being evicted from doomed nodes and restarted unnecessarily many times.
I would like the Cluster API controller to taint all of the nodes it will be replacing with PreferNoSchedule, so that pods prefer to be scheduled onto newer nodes and only fall back to the older, outdated nodes if the cluster has no alternative capacity.
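For illustration, here is a minimal sketch of the kind of soft taint this proposal implies. The taint key below is purely hypothetical; the real key would be decided during implementation.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// outdatedTaint is the kind of soft taint this issue proposes. The key is
// purely illustrative; the actual key would be chosen by the implementation.
var outdatedTaint = corev1.Taint{
	Key:    "node.cluster.x-k8s.io/outdated",   // hypothetical key
	Effect: corev1.TaintEffectPreferNoSchedule, // scheduler avoids the node but may still use it
}

func main() {
	// A node carrying this taint is deprioritized by the scheduler: pods without
	// a matching toleration are placed elsewhere whenever capacity allows.
	fmt.Printf("proposed taint: %s:%s\n", outdatedTaint.Key, outdatedTaint.Effect)
}
```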
Detailed Description
The Cluster API controller should taint all of the nodes belonging to outdated MachineDeployments or MachineSets with PreferNoSchedule.
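A rough sketch of how such a taint could be applied idempotently to a Node with a controller-runtime client; the taint key is the same illustrative one as above and the helper name is made up, not the actual implementation:

```go
package taintsketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// Same illustrative taint as in the previous sketch; the real key is an
// implementation detail still to be decided.
var outdatedTaint = corev1.Taint{
	Key:    "node.cluster.x-k8s.io/outdated",
	Effect: corev1.TaintEffectPreferNoSchedule,
}

// taintOutdatedNode adds the soft taint to a Node if it is not already present.
// In practice this would be called with a client for the workload cluster,
// since Node objects live there rather than in the management cluster.
func taintOutdatedNode(ctx context.Context, c client.Client, node *corev1.Node) error {
	for _, t := range node.Spec.Taints {
		if t.Key == outdatedTaint.Key && t.Effect == outdatedTaint.Effect {
			return nil // already tainted, nothing to do
		}
	}
	patchBase := client.MergeFrom(node.DeepCopy())
	node.Spec.Taints = append(node.Spec.Taints, outdatedTaint)
	return c.Patch(ctx, node, patchBase)
}
```

The check-before-append keeps the operation idempotent, so repeated reconciles don't stack duplicate taints on the same Node.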
Anything else you would like to add:
N/A
/kind feature
Sounds like a good idea to me. We used something like this before in another project, and it also generally speeds up upgrades when PDBs are used, as fewer Pod drains are required.
/triage accepted
I'm +1. TBD whether to implement this as default behaviour or behind a feature flag/annotation. @enxebre @vincepri @CecileRobertMichon opinions?
+1 to a soft-taint feature during MachineDeployment rolling upgrades. I'd expect the tainting to be driven by the MachineDeployment controller as it reconciles old MachineSets. I'm +1 to implementing it as default behaviour as long as it is covered by unit, e2e, and conformance testing.
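To make that concrete, here is a hedged sketch (reusing the hypothetical taintOutdatedNode helper above) of how a MachineDeployment-driven flow might walk the Machines of old MachineSets and taint the Nodes backing them. All names are illustrative, and how the old MachineSets and the workload-cluster client are obtained is glossed over:

```go
package taintsketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// taintNodesOfOldMachineSets walks the Machines belonging to the outdated
// MachineSets of a MachineDeployment and applies the soft taint to the Nodes
// backing them. mgmtClient reads Machines from the management cluster, while
// workloadClient patches Nodes in the workload cluster (e.g. obtained through
// the ClusterCacheTracker). Computing the set of old MachineSets is left out.
func taintNodesOfOldMachineSets(ctx context.Context, mgmtClient, workloadClient client.Client, oldMachineSets []*clusterv1.MachineSet) error {
	for _, ms := range oldMachineSets {
		machines := &clusterv1.MachineList{}
		if err := mgmtClient.List(ctx, machines,
			client.InNamespace(ms.Namespace),
			client.MatchingLabels(ms.Spec.Selector.MatchLabels),
		); err != nil {
			return err
		}
		for i := range machines.Items {
			nodeRef := machines.Items[i].Status.NodeRef
			if nodeRef == nil {
				continue // the Machine has no Node yet
			}
			node := &corev1.Node{}
			if err := workloadClient.Get(ctx, types.NamespacedName{Name: nodeRef.Name}, node); err != nil {
				return err
			}
			if err := taintOutdatedNode(ctx, workloadClient, node); err != nil {
				return err
			}
		}
	}
	return nil
}
```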
Hi,
If there's no assignee, I can handle this based on the idea described in https://github.com/kubernetes-sigs/cluster-api/issues/7043#issue-1333875124. Could I work on it (would you allow me to do /lifecycle active)?
Hi @hiromu-a5a, feel free to comment /assign if you'd like to work on it.
/lifecycle active
/assign
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
/lifecycle active