improve the way of running transient scope for systemd-nspawn

Open dongsupark opened this issue 7 years ago • 0 comments

https://github.com/kinvolk/kube-spawn/pull/328 was merged, but a similar PR, https://github.com/kinvolk/kube-spawn/pull/257, remains unmerged. We should create a new PR on top of the current master branch to address the following issues:

We should use the option --network-namespace-path= for systemd-nspawn, so that the container lifetime is not tied to the lifetime of the slice in which kube-spawn is running.
We should also shut down and remove the network namespace when the cluster is stopped.
We could use systemd-run --service-type=notify together with systemd-nspawn --notify-ready=yes right now. We could then debug more easily with systemctl status run-xxxx.service.
Possibly also use systemd-run --unit= with a user-friendly generated name, and --slice=machine.slice (see /usr/lib/systemd/system/systemd-nspawn@.service).
And lastly, though it might require more refactoring and so may belong in a future step: instead of syscall.ForkExec(), use os/exec so that we can capture stdout/stderr and the exit status, and print them in case of errors. We would need to start the 3 nodes in parallel (in the case of a 3-node cluster), and I don't know whether the current code does this in parallel (hence the need for refactoring).
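
A rough sketch of how the points above could fit together, not the actual kube-spawn code. The helper names (nspawnRunArgs, runAll, exitCode), the unit naming scheme, and the paths under /run/netns and /var/lib/machines are assumptions made for illustration. Only the systemd-run and systemd-nspawn options come from this issue or the respective man pages.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"strings"
	"sync"
)

// nspawnRunArgs builds a systemd-run command line wrapping systemd-nspawn.
// unit, netnsPath and rootDir are hypothetical parameters; the real code
// would likely pass more nspawn options.
func nspawnRunArgs(unit, netnsPath, rootDir string) []string {
	return []string{
		"systemd-run",
		"--unit=" + unit, // user-friendly name instead of run-xxxx
		"--slice=machine.slice",
		"--service-type=notify",
		"systemd-nspawn",
		"--notify-ready=yes",
		"--network-namespace-path=" + netnsPath,
		"--directory=" + rootDir,
		"--boot",
	}
}

// result keeps what syscall.ForkExec() would not give us directly.
type result struct {
	output string // combined stdout and stderr
	err    error  // nil on success, *exec.ExitError on non-zero exit
}

// runAll starts every command concurrently with os/exec and waits for all
// of them. Results are returned in the same order as cmds.
func runAll(cmds [][]string) []result {
	results := make([]result, len(cmds))
	var wg sync.WaitGroup
	for i, argv := range cmds {
		wg.Add(1)
		go func(i int, argv []string) {
			defer wg.Done()
			out, err := exec.Command(argv[0], argv[1:]...).CombinedOutput()
			results[i] = result{output: string(out), err: err}
		}(i, argv)
	}
	wg.Wait()
	return results
}

// exitCode extracts the exit status from an os/exec error.
// It returns 0 for nil and -1 if the process could not be started.
func exitCode(err error) int {
	if err == nil {
		return 0
	}
	var ee *exec.ExitError
	if errors.As(err, &ee) {
		return ee.ExitCode()
	}
	return -1
}

func main() {
	// Only print the command lines for a 3-node cluster here; actually
	// running them needs root and existing network namespaces.
	for i := 0; i < 3; i++ {
		name := fmt.Sprintf("kube-spawn-default-%d", i) // hypothetical naming scheme
		args := nspawnRunArgs(name, "/run/netns/"+name, "/var/lib/machines/"+name)
		fmt.Println(strings.Join(args, " "))
	}
}
```

To start the nodes for real, the three argument slices would be passed to runAll, and for every result with a non-nil err the captured output and exitCode(err) would be printed. With a fixed unit name, systemctl status kube-spawn-default-0.service would then replace the run-xxxx.service lookup.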

Nov 28 '18 17:11 dongsupark