Upgrading Kubernetes
Do you have plans to adopt k8s 1.4.0? If not, how can I upgrade my version?
Yes. Will release update later today.
Will we be able to update an existing cluster?
@owenmorgan upgrading the k8s version requires deleting the etcd cluster, where all of the Kubernetes state is stored on ephemeral disk. I have a forked version of tack where etcd state is persisted on an EBS volume, and that works beautifully. Would anybody be interested in a PR to contribute that back to tack (@wellsie)? Let me know and I will clean up and submit.
In the meantime, you can use a simple workaround to recover from losing the etcd cluster: before upgrading, run a backup snippet along the lines of the sketch below. This will allow you to recover all cluster state (including PVs). ELBs will be regenerated, so update any DNS records accordingly.
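The original snippet is not preserved in this thread; a minimal sketch of the idea, assuming a plain kubectl export and re-apply (the resource list and file name are illustrative, not the exact original):

```bash
# Before the upgrade: dump API objects to a file (illustrative resource list).
kubectl get ns,svc,rc,deployments,secrets,configmaps,pv,pvc \
  --all-namespaces -o yaml > cluster-backup.yaml

# After the new cluster is up: re-create everything from the dump.
kubectl create -f cluster-backup.yaml
```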
Thanks @adambom. How are we looking on an update, @wellsie?
@owenmorgan looks like it was patched in 8f2a62eaa4d3ade4930dcb5d80e8f2d3a072a8a7
Oh, one other thing you'll need to do when you upgrade is taint (in Terraform) or manually update the S3 bucket, so that the files in manifests/etc.tar point to the version of k8s you want to use. Otherwise the update won't actually take.
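For the Terraform route, a sketch of the taint approach; the resource address below is hypothetical, so check `terraform state list` for the actual name in your state:

```bash
# Hypothetical resource address; find the real one with `terraform state list`.
terraform taint aws_s3_bucket_object.etc
terraform apply   # re-uploads manifests/etc.tar on the next apply
```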
Great, I'll give it a shot. Thanks @wellsie @adambom
Is the backup/restore still necessary, @adambom?
I recommend upgrading the cluster manually. I will write up the procedure later this week; in the meantime, here is the basic process:
Update kubelet.service on worker nodes

- SSH into each node and update KUBELET_VERSION in /etc/systemd/system/kubelet.service to a valid KUBELET_VERSION value, then reload and restart (a scripted variant is sketched just after this list):

```
sudo vim /etc/systemd/system/kubelet.service
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```
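A non-interactive variant of that edit, assuming the version is set via an Environment= line in the unit file; the target version string is an example:

```bash
# Assumes kubelet.service sets the version via an Environment= line;
# the target version is an example.
sudo sed -i 's/^Environment=KUBELET_VERSION=.*/Environment=KUBELET_VERSION=v1.4.0_coreos.0/' \
  /etc/systemd/system/kubelet.service
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```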
`make instances` (new with #77) will dump the IPs of all nodes, both master (etcd, apiserver) and workers. Do `make ssh-bastion` and then, from there, SSH into each box one at a time.
Update kubelet.service on etcd/apiserver nodes

Repeat the above procedure for the master (etcd, apiserver) nodes.
Update the version in the Kubernetes manifests on etcd/apiserver nodes

```
grep 1.4 /etc/kubernetes/manifests/*
/etc/kubernetes/manifests/kube-apiserver.yml: image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
/etc/kubernetes/manifests/kube-controller-manager.yml: image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
/etc/kubernetes/manifests/kube-proxy.yml: image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
/etc/kubernetes/manifests/kube-scheduler.yml: image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
```
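To bump those in place, one option is a sed pass over the manifests directory; the kubelet watches this directory and restarts the static pods when the files change. The old and new version strings here are examples:

```bash
# Example version strings; substitute the versions you are moving between.
sudo sed -i 's/v1.4.0_coreos.0/v1.4.3_coreos.0/' /etc/kubernetes/manifests/*.yml

# The kubelet picks up the changed manifests and restarts the static pods;
# watch them come back:
kubectl get pods -n kube-system -w
```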
I'm looking into ways to automate this. It hasn't been a priority since the procedure is fairly straightforward. Note that running pods should continue to run during this procedure.
would this procedure work: https://github.com/coreos/coreos-baremetal/blob/master/Documentation/bootkube-upgrades.md ?
@wellsie any update on the automated Kubernetes update? It is fine to do those ^^^ commands manually if you have a small cluster, but with a big one it would be a headache :)
OK, I have checked out updating /etc/systemd/system/kubelet.service with the newer k8s version; the change does not survive a reboot. :(
@rimusz, it is because tack uses user-data, which runs every time the machine powers up. You can stop the instance, edit the version in the user-data, and then start the instance.
I replaced user-data with cloud-init in my environment; if everything works fine, I will submit a PR.
You can use this procedure:

Update worker nodes

- Create a new launch configuration; you can clone the existing LC and edit the Kubernetes version in its user-data (there are 2 occurrences).
- Terminate all instances and create new ones with the new LC (be sure that you have no persistent volumes, e.g. databases, and that your pods are replicated).
- Detach the instances one by one from the ASG, ticking the checkbox to create a new instance. Check with `kubectl get nodes` that the new node is running, then terminate the node you detached from the ASG. Do this for all nodes (an AWS CLI sketch of this cycle follows the list).
- Updating the user-data for each instance is an alternative.
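A sketch of that detach/replace cycle with the AWS CLI; the instance ID and ASG name are examples:

```bash
# Example instance ID and ASG name; substitute your own.
aws autoscaling detach-instances \
  --instance-ids i-0123456789abcdef0 \
  --auto-scaling-group-name my-worker-asg \
  --no-should-decrement-desired-capacity   # the ASG launches a replacement

# Wait until the replacement node registers and goes Ready
kubectl get nodes -w

# Then terminate the detached instance
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0
```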
Update master nodes

- Update the Kubernetes manifests on the S3 bucket:

  - Download the tar file:

    ```
    aws s3 cp s3://[BUCKET-URL]/manifests/etc.tar .
    tar -xvf etc.tar
    ```

  - Edit the k8s version in all files:

    ```
    grep 1.4 *.yml
    kube-apiserver.yml: image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
    kube-controller-manager.yml: image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
    kube-proxy.yml: image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
    kube-scheduler.yml: image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
    ```

  - Compress and send the file back to S3:

    ```
    tar -cvf etc.tar *.yml
    aws s3 cp etc.tar s3://[BUCKET-URL]/manifests/etc.tar
    ```
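An end-to-end variant of those three steps, using sed for the edit; the version strings are examples and [BUCKET-URL] is your bucket:

```bash
# Fetch, bump the version (example strings), and re-upload the manifests.
aws s3 cp s3://[BUCKET-URL]/manifests/etc.tar .
tar -xvf etc.tar
sed -i 's/v1.4.0_coreos.0/v1.4.3_coreos.0/' *.yml
tar -cvf etc.tar *.yml
aws s3 cp etc.tar s3://[BUCKET-URL]/manifests/etc.tar
```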
- Update the user-data for each node:
  - Stop the instances one at a time to edit the k8s version in the user-data (be sure not to stop more than one instance at a time).
  - Start the instance.
  - Check the health of the etcd cluster with `etcdctl cluster-health`; if all nodes are healthy, repeat for the next instance (a CLI sketch of this cycle follows the list).
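A sketch of that per-master cycle; the instance ID is an example, and the user-data edit itself can be done in the AWS console:

```bash
# One master at a time; example instance ID.
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
# ...edit the k8s version in this instance's user-data (e.g. in the console)...
aws ec2 start-instances --instance-ids i-0123456789abcdef0

# Verify etcd is healthy before touching the next master
etcdctl cluster-health
```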
@wellsie, please validate this.
@yagonobre thanks for your solution. It looks good, but it involves way too much manual fiddling, especially with the user-data for each instance; way too much hassle for production clusters.
I found that using global fleet units for the k8s services is a much better way to do k8s upgrades.
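For reference, a minimal sketch of what a global fleet unit for the kubelet could look like; the unit contents here are assumptions for illustration, not tack's actual files:

```bash
# Illustrative only: a global fleet unit runs on every machine in the cluster,
# so bumping KUBELET_VERSION and restarting the unit upgrades all nodes.
cat > kubelet.service <<'EOF'
[Service]
Environment=KUBELET_VERSION=v1.4.0_coreos.0
ExecStart=/usr/lib/coreos/kubelet-wrapper
Restart=always

[X-Fleet]
Global=true
EOF

fleetctl submit kubelet.service
fleetctl start kubelet.service
```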
Why is it not possible to replace an etcd node and let it re-sync with the cluster?
@rokka-n I do it
Are you open to incorporating automated Kubernetes upgrades? If not, is the purpose of this project a one-time setup, after which you don't need the project anymore?
Yes, open to automated upgrades 👍
Excellent! However, without automated upgrades, is this intended to be a single-use project?