Support for topology spread constraints with cluster autoscaler

Open martin-adema opened this issue 2 years ago • 37 comments

When a deployment is applied using topology spread constraints with a maxSkew of 1 and topology key "topology.kubernetes.io/zone", the cluster autoscaler scales up one zone with too many nodes; after scale-down-unneeded-time has passed, those nodes are removed again. There is a nodepool for each of the 3 zones, and balance-similar-node-groups is set to true.

I would expect a similar number of nodes to be added to each zone, without extra unneeded nodes that are removed again once scale-down-unneeded-time has elapsed.

The issue can be reproduced by applying a deployment of about 30 pods, with resource requests sized at about half the size of the nodes and with topology spread constraints configured:

    topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule

The cluster is set up with autoscaled nodepools per zone and with balance-similar-node-groups set to true.
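As a rough illustration of the expected outcome, the sketch below computes how many nodes each zone would need if the pods were spread evenly. It is only arithmetic on the numbers from the reproduction, not a description of how the cluster autoscaler works internally. The function name, the zone labels, and `pods_per_node=1` are assumptions made for the example (a request of about half a node may leave room for only one pod once system reservations are counted), and existing node capacity is ignored.

```python
from math import ceil


def expected_nodes_per_zone(pods, zones, pods_per_node):
    """Spread pods across zones with a skew of at most 1,
    then size each zone for its own share of pods."""
    base, extra = divmod(pods, len(zones))
    result = {}
    for i, zone in enumerate(zones):
        # The first `extra` zones take one additional pod, so skew <= 1.
        zone_pods = base + (1 if i < extra else 0)
        result[zone] = ceil(zone_pods / pods_per_node)
    return result


# Hypothetical zone labels; 30 pods as in the reproduction above.
zones = ["zone-1", "zone-2", "zone-3"]
print(expected_nodes_per_zone(30, zones, pods_per_node=1))
# prints {'zone-1': 10, 'zone-2': 10, 'zone-3': 10}
```

Under these assumptions every zone ends up with the same node count, which is the balanced scale-up the report expects instead of one zone being overprovisioned.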

MS support ticket 2112070050001650 was opened for this issue. I was told there is no special integration between pod topology spreading and CA, so this behavior is expected. I was advised to open an issue requesting integration of topology spread constraints.

- Kubernetes 1.21.9
- 1 system nodepool (Standard_D16as_v4) with 3 nodes (no autoscaling)
- 3 user nodepools (1 per zone, Standard_D16as_v4) with cluster autoscaling (3 - 30)

martin-adema avatar Mar 17 '22 12:03 martin-adema

Hi martin-adema, AKS bot here :wave: Thank you for posting on the AKS Repo, I'll do my best to get a kind human from the AKS team to assist you.

I might be just a bot, but I'm told my suggestions are normally quite good, as such:

  1. If this case is urgent, please open a Support Request so that our 24/7 support team may help you faster.
  2. Please abide by the AKS repo Guidelines and Code of Conduct.
  3. If you're having an issue, could it be described on the AKS Troubleshooting guides or AKS Diagnostics?
  4. Make sure you're subscribed to the AKS Release Notes to keep up to date with all that's new on AKS.
  5. Make sure there isn't a duplicate of this issue already reported. If there is, feel free to close this one and '+1' the existing issue.
  6. If you have a question, do take a look at our AKS FAQ. We place the most common ones there!

ghost avatar Mar 17 '22 12:03 ghost

Triage required from @Azure/aks-pm

ghost avatar Mar 19 '22 18:03 ghost

Action required from @Azure/aks-pm

ghost avatar Mar 24 '22 19:03 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Apr 09 '22 00:04 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Apr 24 '22 00:04 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar May 09 '22 06:05 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar May 24 '22 06:05 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Jun 08 '22 06:06 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Jun 23 '22 06:06 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Jul 08 '22 12:07 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Jul 23 '22 12:07 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Aug 07 '22 12:08 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Aug 22 '22 12:08 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Sep 06 '22 18:09 ghost

@justindavies could you help take a look?

wangyira avatar Sep 21 '22 18:09 wangyira

Hi, my customer has the same issue when using pod anti-affinity: the CA never triggers when pods cannot be scheduled.

To be sure, is it linked to this issue?

lgmorand avatar Oct 10 '22 12:10 lgmorand

Action required from @Azure/aks-pm

ghost avatar Nov 09 '22 19:11 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Nov 25 '22 00:11 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Dec 10 '22 00:12 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Dec 25 '22 06:12 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Jan 09 '23 06:01 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Jan 24 '23 12:01 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Feb 08 '23 18:02 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Feb 24 '23 00:02 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Mar 11 '23 06:03 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Mar 26 '23 12:03 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Apr 10 '23 18:04 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Apr 26 '23 00:04 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar May 11 '23 06:05 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar May 26 '23 12:05 ghost