Improve how the transient scope for systemd-nspawn is run
https://github.com/kinvolk/kube-spawn/pull/328 was merged, but a similar PR, https://github.com/kinvolk/kube-spawn/pull/257, remains unmerged. We should therefore create a new PR on top of the current master branch to address the following issues:
- We should use the `--network-namespace-path=` option of systemd-nspawn, so that the container lifetime is not tied to the lifetime of the slice in which kube-spawn is running.
- We should also shut down and remove the network namespace when the cluster is stopped.
- We could use `systemd-run --service-type=notify` together with `systemd-nspawn --notify-ready=yes` right now. We could then debug more easily with `systemctl status run-xxxx.service`.
- Possibly use `systemd-run --unit=` with a user-friendly generated name too, and `--slice=machine.slice` (see `/usr/lib/systemd/system/systemd-nspawn@.service`).
- And lastly, though it might require more refactoring and so may belong in a future step: instead of `syscall.ForkExec()`, use Go's `os/exec` package so that we can capture stdout/stderr and the exit status, and print them in case of errors. We would need to start the 3 nodes in parallel (in the case of a 3-node cluster), and I don't know whether the current code already does this in parallel (hence the need for refactoring).
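
To make the points above concrete, here is a rough Go sketch of how they might fit together. It is not taken from the kube-spawn code base: the helper names (`nspawnRunArgs`, `runCaptured`, `runAll`), the unit names and the `/run/netns/...` paths are all hypothetical, and the options selecting the container's root filesystem are left out. It only illustrates building the `systemd-run` command line with the options mentioned above, and running several commands in parallel through `os/exec` while keeping stdout, stderr and the exit status of each.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
	"strings"
	"sync"
)

// nspawnRunArgs builds the systemd-run arguments for one node.
// unitName and netnsPath are hypothetical values chosen by the caller;
// extra holds further systemd-nspawn options (root filesystem etc.).
func nspawnRunArgs(unitName, netnsPath string, extra ...string) []string {
	args := []string{
		"--unit=" + unitName,
		"--slice=machine.slice",
		"--service-type=notify",
		"systemd-nspawn",
		"--notify-ready=yes",
		"--network-namespace-path=" + netnsPath,
	}
	return append(args, extra...)
}

// result keeps everything we would want to print when a node fails to start.
type result struct {
	Stdout   string
	Stderr   string
	ExitCode int   // -1 if the command could not be started at all
	Err      error // nil on success
}

// runCaptured runs one command via os/exec and captures its output.
func runCaptured(name string, args ...string) result {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	err := cmd.Run()

	res := result{Err: err}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		res.ExitCode = exitErr.ExitCode()
	} else if err != nil {
		res.ExitCode = -1
	}
	res.Stdout = stdout.String()
	res.Stderr = stderr.String()
	return res
}

// runAll starts all commands in parallel and waits for every one of them.
// Each element of cmds is a full command line: program name, then arguments.
func runAll(cmds [][]string) []result {
	results := make([]result, len(cmds))
	var wg sync.WaitGroup
	for i, c := range cmds {
		wg.Add(1)
		go func(i int, c []string) {
			defer wg.Done()
			results[i] = runCaptured(c[0], c[1:]...)
		}(i, c)
	}
	wg.Wait()
	return results
}

func main() {
	// Dry run: only print the command lines for a 3-node cluster.
	// Passing them to runAll would actually launch the units.
	for i := 0; i < 3; i++ {
		name := fmt.Sprintf("kube-spawn-node-%d", i) // hypothetical naming scheme
		args := nspawnRunArgs(name, "/run/netns/"+name)
		fmt.Println("systemd-run " + strings.Join(args, " "))
	}
}
```

With a named unit, `systemctl status kube-spawn-node-0.service` would then replace the `run-xxxx.service` lookup. Because `runAll` returns one `result` per node, the caller can print the captured stderr and exit status of exactly the node that failed.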