Support for topology spread constraints with cluster autoscaler
When a deployment is applied with topology spread constraints using a maxSkew of 1 and the topology key "topology.kubernetes.io/zone", the cluster autoscaler scales one zone up with too many nodes; after scale-down-unneeded-time has passed, the extra nodes are removed again. There is one nodepool per zone (3 in total) and balance-similar-node-groups is set to true.
I would expect nodes to be added to each zone in roughly equal numbers, not extra unneeded nodes being added that are removed again after the scale-down-unneeded-time timeout.
The issue can be reproduced by applying a deployment with resource requests sized to about half a node, around 30 replicas, and topology spread constraints configured (a fuller example sketch follows below):
```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
```
Cluster setup: autoscaled nodepools per zone, with balance-similar-node-groups set to true.
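A minimal repro sketch, assuming hypothetical names (spread-repro, a placeholder nginx image) and requests of roughly half a Standard_D16as_v4 node (16 vCPU / 64 GiB); the exact manifest from the original report is not shown:
```yaml
# Repro sketch (assumed names and sizes, not the original manifest):
# ~30 replicas, requests roughly half a Standard_D16as_v4 node, and the
# topology spread constraint described above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spread-repro              # hypothetical name
spec:
  replicas: 30
  selector:
    matchLabels:
      app: spread-repro
  template:
    metadata:
      labels:
        app: spread-repro
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: spread-repro
      containers:
        - name: app
          image: nginx:1.21       # placeholder image
          resources:
            requests:
              cpu: "7"            # ~half of a D16as_v4 (16 vCPU)
              memory: 28Gi        # ~half of 64 GiB, leaving headroom for system pods
```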
MS support ticket 2112070050001650 was opened for this issue. I was told there is no special integration between pod topology spreading and the cluster autoscaler, so this behavior is expected, and was advised to open an issue here requesting integration of topology spread constraints.
Kubernetes 1.21.9
- 1 system nodepool (Standard_D16as_v4) with 3 nodes (no autoscaling)
- 3 user nodepools (1 per zone, Standard_D16as_v4) with cluster autoscaling (3 - 30)
Hi martin-adema, AKS bot here :wave: Thank you for posting on the AKS Repo, I'll do my best to get a kind human from the AKS team to assist you.
I might be just a bot, but I'm told my suggestions are normally quite good, as such:
- If this case is urgent, please open a Support Request so that our 24/7 support team may help you faster.
- Please abide by the AKS repo Guidelines and Code of Conduct.
- If you're having an issue, could it be described on the AKS Troubleshooting guides or AKS Diagnostics?
- Make sure you're subscribed to the AKS Release Notes to keep up to date with all that's new on AKS.
- Make sure there isn't a duplicate of this issue already reported. If there is, feel free to close this one and '+1' the existing issue.
- If you have a question, do take a look at our AKS FAQ. We place the most common ones there!
Triage required from @Azure/aks-pm
Action required from @Azure/aks-pm
Issue needing attention of @Azure/aks-leads
@justindavies could you help take a look?
Hi, my customer has the same issue using podAntiAffinity: the cluster autoscaler never triggers when pods cannot be scheduled.
To be sure, is that linked to this issue?
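For context, a minimal sketch of the kind of required podAntiAffinity that can interact badly with autoscaler scale-up (assumed labels and topology key; the commenter's actual spec is not shown):
```yaml
# Assumed example, not the commenter's configuration: one pod of app=my-app
# per node, enforced at scheduling time.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: my-app          # hypothetical label
        topologyKey: kubernetes.io/hostname
```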