[Question] AKS Update Rollout Policy
Describe scenario
We run three clusters (DEV, QA and PROD) and have a few questions about how AKS updates are rolled out. By this I mean updates to the managed services, not Kubernetes or the nodes, as these are covered elsewhere (and are user-manageable).
Concretely, the load balancer health probe changes for k8s 1.24+ are a point of contention: we were left wondering when this change would roll out, as it caused downtime for us (luckily only in the DEV cluster).
Question
- Does Azure apply AKS updates (load balancer changes, addon updates, etc.) whenever it chooses to, or are changes only made when some other update is triggered by the user (e.g. a node upgrade)?
- Does Azure respect the maintenance windows for a cluster when updating dependent resources (e.g. the load balancer)?
- Is there any way to see which AKS release is being applied to a cluster? I know there's the rollout status viewer, but it is woefully underwhelming and nigh useless because it doesn't accurately reflect the real state.
- Is there a way to define that a new AKS release should be applied to a certain cluster before others (e.g. first to QA, then to PROD)?
Microsoft Support got back to me and confirmed that AKS really will deploy breaking changes to running clusters, with little to no warning. This is quite crazy behaviour IMO. The "planned maintenance" preview kinda helps here, but not entirely, so I am left wondering why this decision was made...
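To make the maintenance-window question concrete, here is a minimal sketch of what "respecting a window" would mean for a single update. It is a local model only: `WeeklyWindow` and its fields are hypothetical names, not the AKS planned maintenance API, and it assumes a simple weekly window given as weekday, start hour and duration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class WeeklyWindow:
    """Hypothetical weekly maintenance window (not an AKS type)."""
    weekday: int         # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    start_hour: int      # hour of day the window opens, 0-23
    duration_hours: int  # how long the window stays open

    def contains(self, moment: datetime) -> bool:
        # Find the most recent opening of the window at or before `moment`,
        # so that windows crossing midnight are handled correctly.
        days_since = (moment.weekday() - self.weekday) % 7
        opened = (moment - timedelta(days=days_since)).replace(
            hour=self.start_hour, minute=0, second=0, microsecond=0
        )
        if opened > moment:
            opened -= timedelta(days=7)
        return moment < opened + timedelta(hours=self.duration_hours)


# Example: Saturday 22:00 for four hours (assumed values).
window = WeeklyWindow(weekday=5, start_hour=22, duration_hours=4)
in_window = window.contains(datetime(2024, 1, 7, 1, 0))       # Sunday 01:00
out_of_window = window.contains(datetime(2024, 1, 3, 12, 0))  # Wednesday noon
```

The question in this issue is whether platform-side changes (such as the load balancer probe update) are gated by a check like this at all, or only user-triggered upgrades are.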
Action required from @Azure/aks-pm
Issue needing attention of @Azure/aks-leads
FWIW, a security patch was applied to our nodes, which were then rebooted, all without any warning whatsoever. Support has been unable to point us at any resource (web page, service health alert, etc.) that would have told us it was coming, or even what happened after the fact. So I'm not sure that updates to the nodes are entirely user-manageable, either.
Hi @siegenthalerroger and @dhduvall,
Thanks for reaching out. We appreciate your concerns. We will take this back to the team.
Tagging @kaarthis for viz
AKS Fleet Manager kinda solves these questions so I guess we're good. Would have been nice to get a comment from the team though.
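For anyone landing here later, the ordering asked about in the original question (QA before PROD, stop if something breaks) can be sketched in a few lines. This is a plain Python model of the idea, not the Fleet Manager API: `staged_rollout`, `apply` and `healthy` are hypothetical names, and the caller would have to supply whatever actually triggers and verifies an update.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    stages: Sequence[Sequence[str]],
    apply: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> List[str]:
    """Apply a release stage by stage and halt at the first unhealthy cluster.

    Returns the clusters that were updated, in order. Clusters in later
    stages are left untouched once a health check fails.
    """
    updated: List[str] = []
    for stage in stages:
        for cluster in stage:
            apply(cluster)
            updated.append(cluster)
            if not healthy(cluster):
                return updated  # stop here; later stages are never touched
    return updated


# Example with stand-in callbacks: QA fails its check, so PROD is skipped.
applied: List[str] = []
result = staged_rollout(
    stages=[["dev"], ["qa"], ["prod"]],
    apply=applied.append,
    healthy=lambda cluster: cluster != "qa",
)
```

The point of the design is simply that PROD never sees a release that has not first survived the earlier stages.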