Improve OMD's detection of whether it is running in a containerized environment
## Issue
When running Checkmk as a K3s Pod and also mounting a tmpfs, a site update will fail with the repeated error message `umount: /opt/omd/sites/SITE/tmp: umount failed: Operation not permitted.`
This was discovered while trying to update from 2.0.0p27 to 2.1.0p9 on an RKE2 cluster (v1.21.8+rke2r1).
## Reason
The reason lies in OMD's `is_dockerized` function in `omdlib.utils`. This function tries to detect whether it is running in a container by checking for the two files `/.dockerenv` and `/run/.containerenv`. Depending on its output, the function `_tmpfs_is_managed_by_node` in `omdlib.tmpfs` decides how to handle the tmpfs (u)mount.
As neither of the aforementioned files is present inside the Checkmk container running on RKE2, `is_dockerized` returns `False`. This results in OMD expecting to be able to unmount the tmpfs, thus failing the update.
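For context, the current check is essentially equivalent to the following sketch (not the verbatim omdlib code; only the file names match what is described above):

```python
from pathlib import Path


def is_dockerized() -> bool:
    """Sketch of the current detection: only two runtime-specific
    marker files are checked, so containerd-based setups like RKE2
    are not recognized."""
    return Path("/.dockerenv").exists() or Path("/run/.containerenv").exists()
```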
## Possible workarounds
- Unmounting the tmpfs and redeploying Checkmk, then doing the update without the tmpfs mount. Afterwards, the tmpfs may be mounted again. This involves a total of three deployments, though.
- Manually creating a file `/.dockerenv` or `/run/.containerenv` inside the container or the image. This either requires a custom image build or doing it on every update (or mounting a dummy file to one of the locations, which would be quite hacky, though).
## Proposed solution
Instead of just looking at those two files, it would be possible to check for the existence of certain cgroups in `/proc/1/cgroup`. This is the change that this pull request proposes.
I updated `is_dockerized` with logic to read `/proc/1/cgroup` and look for specific cgroups:

- `/kubepods` → Kubernetes-like deployments
- `/docker` → Docker deployments
- `/lxc` → LXC deployments
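A minimal sketch of that logic (the exact marker strings and error handling in the PR may differ slightly):

```python
from pathlib import Path

# Cgroup path fragments that point at well-known container runtimes.
_CONTAINER_CGROUP_MARKERS = ("/kubepods", "/docker", "/lxc")


def _is_containerized_by_cgroup() -> bool:
    """Return True if /proc/1/cgroup references a known container runtime."""
    try:
        content = Path("/proc/1/cgroup").read_text()
    except OSError:
        return False
    return any(marker in content for marker in _CONTAINER_CGROUP_MARKERS)
```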
I am not sure whether there is a better way to achieve this, or whether there are other cgroups that should be added. Also, I do not know whether this strictly is a bug or not. That's for you to decide ;)
We have other customers running Checkmk on a k8s cluster who had no issues with updates. They published their Helm charts. Would those work for you?
Thank you for the link to the charts. We might actually be able to use them. However, the YAML specification for the Checkmk deployment looks quite the same as ours does at the moment. Specifically, the way the tmpfs is mounted is exactly the same.
It would be interesting to know what flavour of Kubernetes they were/are running. On our RKE2/containerd setup, the culprit is the missing files `/.dockerenv` and `/run/.containerenv`, which breaks OMD's "in container" detection mechanism. (One could argue this is not an issue with OMD but with the container runtime that does not create these files, though...)
Yeah, seems like a K3s issue. If CRI-O is an option as a runtime, it does mount the files. Are there any official docs on how to know if a program is running in a K3s container? A quick Google search could not find anything.
FYI I'm not opposed to including this change. I would just like to understand what the most portable way is to figure out if OMD is started in a container.
Hard to find anything conclusive on that topic. Since we create the container ourselves, we can set an environment variable to detect if we are in the container. When creating the image, we can set a variable `CONTAINER=True` and check for that in OMD. That is independent of the container runtime.
> Hard to find anything conclusive on that topic. [...]
This is what I discovered as well...
> [...] When creating the image, we can set a variable [...]
The env variable in the container would be an easy option, I guess. The choice of name would have to be "globally distinctive", though. E.g. someone might already have set a variable `CONTAINER` on their system, which could lead to conflicts.
> [...] Are there any official docs on how to know if a program is running in a K3s container? [...]
I will ask around internally whether anyone has any insights into this.
So, the most promising feedback I got from my colleagues mentions `/proc/1/sched`. Using the first line, we would get `init` or `systemd` on non-container systems and anything else when running in a container.
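A sketch of that heuristic (assuming the first token of the first line of `/proc/1/sched` is the comm name of PID 1):

```python
def _pid1_is_init_or_systemd() -> bool:
    """Heuristic: on a regular host, PID 1 is init or systemd; inside a
    container, PID 1 is usually the entrypoint process instead."""
    try:
        with open("/proc/1/sched") as f:
            first_line = f.readline()
    except OSError:
        return True  # conservative fallback: assume a regular host
    return first_line.split(" ", 1)[0] in ("init", "systemd")
```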
However, a quick Google search leaves me sceptical whether this check alone is actually a reliable source of information. There is a PR in the systemd repository discussing exactly this: https://github.com/systemd/systemd/pull/17902.
They decided to check a number of files in `/sys`, as well as the env variable `container` (which seems to be present neither on RKE2/containerd nor on Docker 20.10), and the two files `/.dockerenv` and `/run/.containerenv`.
Since you are in control of the deployment, I am unsure whether this "all-in-one" approach is actually necessary, though.
The most portable solution to me seems to be including an environment variable `CMK_CONTAINERIZED` in the image and checking for that in OMD. I would leave the checks for the files in place, in case someone wants to build their own images. Would that work for you?
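As a sketch of what I have in mind (the exact value the image would set, e.g. `ENV CMK_CONTAINERIZED="TRUE"` in the Dockerfile, is an assumption here, not final):

```python
import os
from pathlib import Path


def is_containerized() -> bool:
    """Detect containers via the env variable baked into the official
    image, falling back to the marker files for self-built images."""
    return (
        os.environ.get("CMK_CONTAINERIZED") == "TRUE"  # assumed value, see above
        or Path("/.dockerenv").exists()
        or Path("/run/.containerenv").exists()
    )
```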
> Would that work for you?
Yes absolutely. Thank you 🙂
The Dockerfile we use can be found under `/docker`. Can you test those changes yourself and check if they work for you?
You mean by adding the ENV variable, updating the detection in `omdlib.utils`, building the image, and trying the update with the custom image? I will have a look at it 👍
Awesome thanks. Let me know if you have any problems.
Just a quick update: I was able to build both images (took quite some trial and error to set up the build environment ;) ). I should be able to provide further information by the end of this week.
After figuring out why the environment variable was lost during the site update (`set_environment` in `omdlib.main`), I now have a working version ready. The PR has already been updated with the new proposal.
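To illustrate the kind of fix that was needed (the helper name below is made up; the actual change lives in `set_environment` in `omdlib.main`): the variable has to be carried over explicitly when OMD rebuilds the environment during the update.

```python
import os

# Variables that must survive the environment reset during a site update;
# losing CMK_CONTAINERIZED here broke the in-container detection later on.
_PASSTHROUGH_VARS = ("CMK_CONTAINERIZED",)


def build_update_environment(base_env: dict[str, str]) -> dict[str, str]:
    """Hypothetical helper: rebuild the environment but keep the
    container marker variable from the current process environment."""
    env = dict(base_env)
    for var in _PASSTHROUGH_VARS:
        if var in os.environ:
            env[var] = os.environ[var]
    return env
```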
With the current approach, the update completes successfully when running on our Kubernetes cluster with the tmpfs mounted and updating to a new version. I have tested this by updating from 2.0.0p28 (official build) to 2.1.0p11 (custom build with the updated detection).
Note that during my testing I discovered some other minor inconveniences with a deployment on (our?) Kubernetes cluster that are not addressed by this PR:
- After an update, OMD tries `chown`ing the tmpfs mount path, which is not allowed (in our deployment at least). This results in the pod actually crashing after the update, yielding a restart. Only afterwards does everything run as expected.

  ```
  ... rest of site update snipped ...
  Temporary filesystem already mounted
  ... rest of traceback snipped ...
    File "/omd/versions/2.1.0p11.cre/lib/python3/omdlib/main.py", line 367, in chown_tree
      os.chown(directory, uid, gid)
  PermissionError: [Errno 1] Operation not permitted: '/omd/sites/testsite/tmp'
  ```
- Another issue is OMD discovering the folder `lost+found` in the persistent volume for `/omd/sites` as an empty site, resulting in error messages during site creation and updates. (The existence of `lost+found` depends on the underlying filesystem; see the sketch after this list.) E.g. during site creation:

  ```
  ...
  ERROR: Failed to read config /omd/sites/lost+found/etc/omd/site.conf of site lost+found. AGENT_RECEIVER_PORT port will possibly be allocated twice
  ERROR: Failed to read config /omd/sites/lost+found/etc/omd/site.conf of site lost+found. APACHE_TCP_PORT port will possibly be allocated twice
  ...
  ```
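For illustration only, a hypothetical guard for the `lost+found` issue could look like this (function name and location are made up; this is not part of the PR):

```python
from pathlib import Path

# Filesystem artifacts that must not be treated as site directories.
_NON_SITE_DIRS = {"lost+found"}


def site_dir_names(sites_root: str = "/omd/sites") -> list[str]:
    """Hypothetical helper: list site directories, skipping fs artifacts."""
    return sorted(
        p.name
        for p in Path(sites_root).iterdir()
        if p.is_dir() and p.name not in _NON_SITE_DIRS
    )
```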
Thanks. I forwarded your inconveniences to our k8s team; they sound like we rely on some things that we shouldn't.
Thanks for this fix. I'll add this to 2.1 as well.