containers-roadmap
[EKS]: One-click Full Cluster Upgrade
Automation in EKS as a service has arrived only in bits and pieces. EKS with managed nodes is not very useful without a one-click full upgrade, where the EKS control plane version, aws-node, CoreDNS, etc. are upgraded along with the worker nodes, without running or orchestrating commands manually. This can be broken down into two pieces:
1) Full cluster upgrade including nodes
2) Worker-node-only upgrades for ongoing AMI rotation
This is critical functionality.
Renaming this to 'One-click Full Cluster Upgrade'. Today EKS managed nodes supports worker node upgrades for AMI rotation, but this is not yet in the console (https://github.com/aws/containers-roadmap/issues/605). See API documentation here: https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateNodegroupVersion.html
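For reference, the node group piece is already scriptable against that API. A minimal boto3 sketch (not an official tool; the cluster and node group names below are placeholders, not from this issue):

```python
# Sketch: roll a managed node group to the latest AMI release for its current
# Kubernetes version via the UpdateNodegroupVersion API.
import boto3

eks = boto3.client("eks")

resp = eks.update_nodegroup_version(
    clusterName="my-cluster",      # hypothetical cluster name
    nodegroupName="my-nodegroup",  # hypothetical node group name
    # Omitting `version`/`releaseVersion` asks EKS to move the node group to
    # the latest AMI release for the cluster's current Kubernetes version.
)
update_id = resp["update"]["id"]

# Poll the update until EKS reports success or failure.
status = eks.describe_update(
    name="my-cluster", nodegroupName="my-nodegroup", updateId=update_id
)["update"]["status"]
print(update_id, status)
```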
I just read the official AWS documentation for upgrading an EKS cluster: we have to manually run kubectl commands to upgrade critical Kubernetes components. Really, for a "managed" service, is this a joke? Even updating my on-premise cluster is easier than updating the so-called "managed" AWS Kubernetes service.
https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
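To make the gap concrete: the control plane bump itself is a single API call, but per that guide the add-ons are not touched by it. A hedged boto3 sketch, with placeholder names and a hypothetical target version:

```python
# Sketch of what the managed part covers today: bumping only the control plane.
import boto3

eks = boto3.client("eks")

current = eks.describe_cluster(name="my-cluster")["cluster"]["version"]
print("current control plane version:", current)

resp = eks.update_cluster_version(
    name="my-cluster",  # hypothetical cluster name
    version="1.21",     # hypothetical target minor version (one step at a time)
)
print("update id:", resp["update"]["id"])

# Note: per the linked user guide, aws-node (VPC CNI), kube-proxy and CoreDNS
# are NOT updated by this call and still have to be bumped separately.
```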
In Azure AKS, this is a single click operation with a version drop down.
The upgrade process can't get enough love, please work on this! It is incredibly important for production workloads to have a managed process for this :)
+1 for this
+1 EKS need to close the gaps with AKS and GKE
+1
This feature could impact people working under the assumption that AWS will not modify resources running inside the cluster, notably kube-proxy (#657)
+1
+1
Hi Team,
There is a sample package, eks-one-click-cluster-upgrade, in aws-samples which provides similar functionality. It is a CLI utility that can be used to carry out upgrades. Please check this package and share your feedback.
We still can't call EKS a fully managed k8s engine if, for every release, we have to spend time and effort checking the release notes for every addon.
One other aspect of upgrades that would be great to include in this effort is the ability to configure the upgrade timeout for node groups. Currently, if it takes longer than 15 minutes to replace a node, the upgrade fails and rolls back.
Related doc: https://docs.aws.amazon.com/eks/latest/userguide/managed-node-update-behavior.html Relevant part:
> Drains the pods from the node. If the pods don't leave the node within 15 minutes and there's no force flag, the upgrade phase fails with a PodEvictionFailure error. For this scenario, you can apply the force flag with the update-nodegroup-version request to delete the pods.
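For anyone hitting this today, the force flag is exposed on the same UpdateNodegroupVersion API. A minimal boto3 sketch with placeholder names (the 15-minute drain window itself still isn't configurable):

```python
# Sketch: retry a node group update with force=True after a PodEvictionFailure
# (e.g. a PodDisruptionBudget blocking eviction beyond the drain window).
import boto3

eks = boto3.client("eks")

resp = eks.update_nodegroup_version(
    clusterName="my-cluster",      # hypothetical cluster name
    nodegroupName="my-nodegroup",  # hypothetical node group name
    force=True,  # delete pods that could not be evicted within the window
)
print(resp["update"]["id"])
```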
+1, we must have it.
I'm not really understanding the value here. In a cluster with many addons needed to make it function, such as istio, Argo, external-dns, Prometheus, cert-manager, etc., bumping VPC CNI/kube-proxy is a trivial task and is easily automated via your GitOps management methods.
Is the target audience customers who want a more out-of-the-box Kubernetes with a basic set of AWS-provided addons?
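(If it helps, one non-GitOps automation path is the EKS managed add-ons API. A hedged boto3 sketch; the cluster name and target version are hypothetical, and the add-on must already be installed as a managed add-on:)

```python
# Sketch: bump the vpc-cni managed add-on via the EKS add-ons API.
import boto3

eks = boto3.client("eks")

# List add-on versions compatible with a given Kubernetes version.
candidates = eks.describe_addon_versions(addonName="vpc-cni", kubernetesVersion="1.21")

# Roll the managed add-on to a chosen version.
eks.update_addon(
    clusterName="my-cluster",          # hypothetical cluster name
    addonName="vpc-cni",
    addonVersion="v1.9.0-eksbuild.1",  # hypothetical target version
    resolveConflicts="OVERWRITE",      # let EKS reconcile the add-on's manifests
)
```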