terraform-kubernetes-installer

Configure the bootstrap processes on etcd, master and worker node VMs as daemon processes

Open srirg opened this issue 7 years ago • 4 comments

If the VM instances are restarted, the processes on the etcd, master and worker VMs do not come back up. We should configure the processes that we start during bootstrap on the etcd, master and worker VMs as daemon processes.

srirg avatar Dec 07 '17 17:12 srirg

OK, I think I may have fixed this in an unrelated commit. The docker service was being started but not enabled. (Sorry, I found this and meant to split it out, but went on holiday and forgot.)

https://github.com/oracle/terraform-kubernetes-installer/pull/75/files#diff-e91822387365c76f9ce415a2ef3aa164R24

Can you please test again? They should all come up now.

garthy avatar Jan 04 '18 15:01 garthy
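The distinction in the comment above (started versus enabled) can be sketched as a small shell helper. This is only an illustration, not the installer's actual bootstrap code: the function name `ensure_enabled_and_started` and the `SYSTEMCTL` override variable are hypothetical, added so the helper can be dry-run without touching a real system.

```shell
# Sketch: make a systemd unit both running now and enabled at boot.
# SYSTEMCTL is a hypothetical override for dry runs; by default the
# real systemctl binary is used.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

ensure_enabled_and_started() {
  unit="$1"
  # "enable" registers the unit to start at boot; it does not start it now.
  $SYSTEMCTL enable "$unit" || return 1
  # "start" runs the unit immediately; on its own it does not survive a reboot.
  $SYSTEMCTL start "$unit" || return 1
  # "is-enabled" exits 0 when the unit is enabled.
  $SYSTEMCTL is-enabled "$unit"
}
```

On a node this would be called as `ensure_enabled_and_started docker`, typically with root privileges. Setting `SYSTEMCTL=echo` beforehand prints the three commands instead of running them.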

Sure, we will verify. The fix is just for etcd, right?

srirg avatar Jan 04 '18 16:01 srirg

I have tested a reboot of the etcd node but the etcd docker service didn't come back up after the restart.

docker ps -a
CONTAINER ID   IMAGE                        COMMAND                  CREATED       STATUS                            PORTS   NAMES
d064acd5e8a5   quay.io/coreos/etcd:v3.2.2   "/usr/local/bin/et..."   2 hours ago   Exited (255) About a minute ago           gifted_swartz

So I tried manually starting it up:

2018-01-04 16:16:43.482445 W | etcdmain: no data-dir provided, using default data-dir ./etcd-ad1-0.etcd
2018-01-04 16:16:43.482540 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2018-01-04 16:16:43.482703 I | embed: listening for peers on http://0.0.0.0:2380
2018-01-04 16:16:43.482775 I | embed: listening for client requests on 10.0.20.2:2379
2018-01-04 16:16:43.482822 I | embed: listening for client requests on 127.0.0.1:2379
2018-01-04 16:16:43.483020 C | etcdmain: cannot access data directory: open etcd-ad1-0.etcd/.touch: permission denied

So we need to specify the etcd data directory as well for it to work. I set etcd_iscsi_volume_create=true during creation, but the etcd docker container still doesn't come up when the etcd node instance is restarted.

So I tried manually starting it up again, but now I get this:

2018-01-04 18:41:44.182305 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-01-04 18:41:44.182324 I | etcdmain: advertising using detected default host "10.0.20.2"
2018-01-04 18:41:44.182338 W | etcdmain: no data-dir provided, using default data-dir ./etcd-ad1-0.etcd
2018-01-04 18:41:44.182397 C | etcdmain: error listing data dir: etcd-ad1-0.etcd

srirg avatar Jan 04 '18 16:01 srirg
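The logs above warn that no data-dir was provided, and the container did not restart by itself after the reboot. One way to address both is sketched below, under stated assumptions: it is not the command the installer actually runs. The helper name `etcd_docker_cmd`, the host directory, and the in-container path `/var/lib/etcd` are hypothetical. `--restart=always` is Docker's restart policy flag, and `ETCD_DATA_DIR` is the environment-variable form of etcd's `--data-dir` flag. The function only prints the command so it can be reviewed first.

```shell
# Sketch: print a docker run command that gives etcd an explicit data
# directory and a restart policy. Nothing is executed here.
etcd_docker_cmd() {
  host_dir="$1"   # hypothetical host directory, e.g. a mounted volume
  shift           # remaining arguments are passed through to the image
  echo docker run -d \
    --restart=always \
    -v "$host_dir:/var/lib/etcd" \
    -e ETCD_DATA_DIR=/var/lib/etcd \
    quay.io/coreos/etcd:v3.2.2 "$@"
}
```

For example, `etcd_docker_cmd /etcd` prints the command for a host directory named `/etcd`. A restart policy only restarts the container; it would not by itself resolve a permission error on the data directory.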

OK, this is due to SELinux: we disable it, but it gets re-enabled on reboot.

This should fix it.

https://github.com/oracle/terraform-kubernetes-installer/pull/84

garthy avatar Jan 05 '18 14:01 garthy
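As background to the SELinux comment above: a runtime change such as `setenforce 0` lasts only until the next reboot, while the mode applied at boot is read from `/etc/selinux/config`. The sketch below shows one way to persist the mode. It is an assumption about the general approach, not a description of what the linked pull request does, and the function name `set_selinux_mode` is hypothetical.

```shell
# Sketch: persist the SELinux mode by rewriting the SELINUX= line in the
# config file. The second argument exists so the function can be tried
# against a copy of the file instead of /etc/selinux/config.
set_selinux_mode() {
  mode="$1"
  config="${2:-/etc/selinux/config}"
  case "$mode" in
    enforcing|permissive|disabled) ;;
    *) echo "invalid mode: $mode" >&2; return 1 ;;
  esac
  tmp="$(mktemp)" || return 1
  # The pattern requires "=" directly after SELINUX, so the
  # SELINUXTYPE= line is left untouched.
  if sed "s/^SELINUX=.*/SELINUX=$mode/" "$config" > "$tmp"; then
    # Write back in place so the file keeps its inode and permissions.
    cat "$tmp" > "$config"
    status=$?
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}
```

Calling `set_selinux_mode permissive` as root edits the real config file. The new mode takes effect at the next boot, so a runtime `setenforce 0` is still needed for the current session.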