Feature Request: Enable systemd based Containers in VIC
User Statement
As a user, I would like to deploy containers that run systemd inside the container, since this greatly simplifies writing Dockerfiles that use already-packaged software. With VIC-1.5 it is possible to use alternate Linux kernels (e.g. from CentOS), which makes this feature even more interesting, since it would enable users to run e.g. the full CentOS stack in Container VMs. This is the reason for this feature request, which has been discussed in the past but was never implemented.
Details
The CentOS systemd container should run with VIC. The corresponding Dockerfile looks like this:
FROM centos/systemd
MAINTAINER "Your Name"
RUN yum -y install httpd; yum clean all; systemctl enable httpd.service
EXPOSE 80
CMD ["/usr/sbin/init"]
and shows how you can avoid re-inventing the wheel. It would also make it possible to replace OVA deployments completely with VIC-based Container VMs. The command
docker run --privileged --name httpd -v /sys/fs/cgroup:/sys/fs/cgroup:ro -p 80:80 -d httpd
should be replaceable with the following command in a VIC context
docker run --name httpd -p 80:80 -d httpd
i.e. you do not need privileged mode or mount hacks.
Adding some background, and hints for how to add this.
Prior to the custom ISO work we did use systemd to initialize /dev and then switchroot into the container filesystem with tether as pid1. With the custom ISO work we also had to support sysv init systems (no systemd present) so we now have the system-init script.
I cannot give a solid estimate for supporting a systemd based container because we’ve never gone through it in depth, but:
- Use systemd for the system init. This may be as simple as a custom ISO configuration that calls “exec /lib/systemd/systemd --system” once any cVM-specific init has occurred.
- Launch tether, but not as PID 1.
  a. This could be after starting systemd or via a systemd unit.
  b. Tether unit tests do not run as PID 1, so this should be viable without alteration.
  c. We may need to confirm that children are reaped and their exit codes collected correctly; this may require Linux 3.4 and up for the PR_SET_CHILD_SUBREAPER support (see lib/tether/tether_linux.go).
- If the container directly runs systemd then that may need to be replaced with “systemd --user” (I don’t recall the exact argument name) if that's not automatically detected.
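To make the reaping concern concrete, here is a minimal Go sketch of how a process that is not PID 1 can mark itself as a child subreaper and still collect a child's exit code. This is not tether's actual code: the function names (setChildSubreaper, isChildSubreaper, exitCodeOf, reapOrphans) are hypothetical, and the prctl option numbers are copied from <linux/prctl.h> because the standard syscall package does not export them. It assumes Linux 3.4 or later.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
	"unsafe"
)

// Option numbers from <linux/prctl.h>; not exported by the syscall package.
const (
	prSetChildSubreaper = 36
	prGetChildSubreaper = 37
)

// setChildSubreaper marks this process as a child subreaper, so orphaned
// descendants are reparented to it rather than to PID 1 (systemd here).
func setChildSubreaper() error {
	if _, _, errno := syscall.RawSyscall(syscall.SYS_PRCTL, prSetChildSubreaper, 1, 0); errno != 0 {
		return errno
	}
	return nil
}

// isChildSubreaper reads the subreaper attribute back from the kernel.
func isChildSubreaper() (bool, error) {
	var flag int32
	if _, _, errno := syscall.RawSyscall(syscall.SYS_PRCTL, prGetChildSubreaper, uintptr(unsafe.Pointer(&flag)), 0); errno != 0 {
		return false, errno
	}
	return flag != 0, nil
}

// exitCodeOf runs a command to completion and returns its exit code.
func exitCodeOf(name string, args ...string) (int, error) {
	err := exec.Command(name, args...).Run()
	if err == nil {
		return 0, nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), nil
	}
	return -1, err
}

// reapOrphans collects every already-exited child without blocking and
// returns how many were reaped. Calling it while exec.Cmd.Wait is pending
// on a direct child can steal that child's status, so a real supervisor
// has to coordinate the two.
func reapOrphans() int {
	reaped := 0
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if pid <= 0 || err != nil {
			return reaped
		}
		reaped++
	}
}

func main() {
	if err := setChildSubreaper(); err != nil {
		fmt.Fprintln(os.Stderr, "prctl(PR_SET_CHILD_SUBREAPER) failed:", err)
		os.Exit(1)
	}
	enabled, _ := isChildSubreaper()
	code, err := exitCodeOf("sh", "-c", "exit 3")
	fmt.Printf("pid=%d subreaper=%v child exit code=%d err=%v orphans reaped=%d\n",
		os.Getpid(), enabled, code, err, reapOrphans())
}
```

The subreaper attribute applies to the whole process, so it does not matter which Go runtime thread issues the prctl call. Whether tether needs anything beyond this when systemd is PID 1 is exactly the part that would need confirming.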
Things to consider:
- Do you start systemd before or after the switch_root to the container filesystem?
  a. systemd makes use of D-Bus, so I highly recommend using the systemd mechanisms to do the switch_root, as that should ensure systemd functionality moves over smoothly.
- Which systemd unit files need to be present in the container image for systemd to function correctly? It may be necessary to copy parts of /etc/systemd, /run/systemd, and /usr/lib/systemd into /.tether and then bind-mount them into the container rootfs. This is where the speculation really starts, as I've never experimented with this part.
I do think this would be extremely useful work, and it is a necessary prerequisite for supporting kubelet running in a cVM if you want to be able to support Kubernetes-cluster-in-a-VCH, which I think is also extremely useful work.
Thanks for the input. I will also talk to a few customers about it.