eks-rolling-update
Could not configure Kubernetes Python Client
Hi
I have been testing this tool and keep hitting the same stumbling block: it cannot configure the Kubernetes Python client, even though the Python client is installed. Is this a known issue? If not, is there a way to dig deeper into what the Kubernetes Python client's dependencies are?
[ root$] docker run -ti --rm -e AWS_DEFAULT_REGION -v "/root/.aws/config" -v "/root/.kube/us-gpd" eks-rolling-update:latest -c gpdeks1
2021-02-25 02:31:16,139 INFO Describing autoscaling groups...
2021-02-25 02:31:16,444 ERROR Could not configure kubernetes python client
2021-02-25 02:31:16,444 ERROR *** Rolling update of ASG has failed. Exiting ***
2021-02-25 02:31:16,444 ERROR AWS Auto Scaling Group processes will need resuming manually
Thanks
Looks like you are not using the default KUBECONFIG location (us-gpd vs config) - you may need to override KUBECONFIG or mount it to a different location, per the docs and the example?
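A rough sketch of the "override KUBECONFIG" option, in case it helps. It assumes the container runs as root and that the tool loads its kubeconfig through the Kubernetes Python client, which reads the KUBECONFIG environment variable when it is set. The in-container path /root/.kube/us-gpd and the DOCKER preview variable are illustrative choices, not something taken from the project docs.

```shell
#!/bin/sh
# Preview by default: the command is echoed. Set DOCKER=docker to really run it.
DOCKER="${DOCKER:-echo docker}"

# Assumed in-container path; any path works as long as KUBECONFIG matches it.
KUBECONFIG_IN_CONTAINER="/root/.kube/us-gpd"

rolling_update_with_kubeconfig() {
  # Mount host files as host:container pairs and point KUBECONFIG at the mount.
  $DOCKER run -ti --rm \
    -e AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION}" \
    -e KUBECONFIG="${KUBECONFIG_IN_CONTAINER}" \
    -v "$HOME/.aws:/root/.aws" \
    -v "$HOME/.kube/us-gpd:${KUBECONFIG_IN_CONTAINER}" \
    eks-rolling-update:latest -c gpdeks1
}

rolling_update_with_kubeconfig
```

Note that each -v needs both a host path and a container path; a single path on its own only creates an empty anonymous volume inside the container.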
Thanks @chadlwilson
I followed the docs exactly and got it running. One final question: it correctly gets the right node count, but I see no restarts. When it says the rolling update is complete, does it only perform an update once we have upgraded the agents? Should we expect it to roll the worker nodes even if we have not upgraded them?
[ root@gpd-terraform-builder ~ $] docker run -ti --rm -e AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION} -v "$HOME/.aws":"/root/.aws" -v "$HOME/.kube/us-gpd":"/root/.kube/config" eks-rolling-update:latest -c gpdeks1
2021-02-25 15:16:25,772 INFO Describing autoscaling groups...
2021-02-25 15:16:27,483 INFO Getting k8s nodes...
2021-02-25 15:16:27,677 INFO Current k8s node count is 18
2021-02-25 15:16:27,678 INFO All asgs processed
2021-02-25 15:16:27,679 INFO *** Rolling update of all asg is complete! ***
That looks like a "no-op" run, so it probably thinks all nodes are up-to-date - that is, they are already using the latest launch template version specified for the ASG(s), so there are no nodes to roll. There is age-based support with RUN_MODE=4 and MAX_ALLOWABLE_NODE_AGE if that's what you are looking for.
You could try "touching" the launch template by saving it without making any real changes, so that it creates a new version. You can then run with -p to "plan" and you should see the tool identifying your nodes as being out of date. If you run it for real by removing -p you'll see it trying to drain and terminate nodes.
Might want to do it on a smaller test cluster first, or on a single ASG if you have multiple sharing the same launch template, using the ASG_NAMES filter :-)
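Something like the following could serve as a starting point for a plan-only run limited to one ASG. The ASG name my-test-asg is a made-up placeholder, and the DOCKER preview variable is only there so the command can be inspected before it is run; the mounts are the ones from the working command above.

```shell
#!/bin/sh
# Preview by default: the command is echoed. Set DOCKER=docker to really run it.
DOCKER="${DOCKER:-echo docker}"

# Hypothetical ASG name; replace with one of the ASGs attached to the cluster.
ASG_TO_ROLL="my-test-asg"

plan_single_asg() {
  # -p asks the tool to plan only; ASG_NAMES narrows the run to the named ASG.
  $DOCKER run -ti --rm \
    -e AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION}" \
    -e ASG_NAMES="${ASG_TO_ROLL}" \
    -v "$HOME/.aws:/root/.aws" \
    -v "$HOME/.kube/us-gpd:/root/.kube/config" \
    eks-rolling-update:latest -c gpdeks1 -p
}

plan_single_asg
```

Once the plan output lists the nodes you expect, dropping the trailing -p would perform the real drain and terminate.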
Excellent.
So what I'll be after is something like this, as we want a pipeline where we can manually execute this job at any time: RUN_MODE=4 and MAX_ALLOWABLE_NODE_AGE=0
Do you have any examples of how we are supposed to pass these variables in? Are they just variables to add to our docker exec command?
They are just env vars that need to be available to Python, so when running with Docker you can supply them the same way you supply any env vars to a container (with -e, --env-file, etc.).
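As a minimal sketch of the --env-file route, using the two values from the question: the file name rolling-update.env and the DOCKER preview variable are arbitrary choices for illustration. Whether MAX_ALLOWABLE_NODE_AGE=0 really matches every node is worth confirming with a -p plan run first.

```shell
#!/bin/sh
# Preview by default: the command is echoed. Set DOCKER=docker to really run it.
DOCKER="${DOCKER:-echo docker}"

# Arbitrary file name; docker reads one VAR=value pair per line from it.
ENV_FILE="${ENV_FILE:-./rolling-update.env}"
cat > "$ENV_FILE" <<'EOF'
RUN_MODE=4
MAX_ALLOWABLE_NODE_AGE=0
EOF

run_age_based_update() {
  # Equivalent to passing: -e RUN_MODE=4 -e MAX_ALLOWABLE_NODE_AGE=0
  $DOCKER run -ti --rm \
    -e AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION}" \
    --env-file "$ENV_FILE" \
    -v "$HOME/.aws:/root/.aws" \
    -v "$HOME/.kube/us-gpd:/root/.kube/config" \
    eks-rolling-update:latest -c gpdeks1
}

run_age_based_update
```

An env file tends to suit a pipeline job, since the settings can live in version control next to the job definition, while -e is handier for one-off runs.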