Having overlayfs mounted may not be a good indication of running in a container
For example, docker seems to mount overlay2 filesystems in the host namespace:
~ findmnt
...
├─/var/lib/docker flutterbat-root/var/docker zfs rw,nodev,noatime,xattr,posixacl
│ └─/var/lib/docker/overlay2/c35c2e7f2a781b242319e96aeba445ba29ac71303050989d5033d28a900ec49c/merged
│ overlay overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/ENMSXZB2IW43DG5PIKATMRUHRZ:/var/lib/docker/overlay2/l/EAM2LAHE6SJHXQ2UOLWNROJSE5, ...
...
A better check might be whether the root file system itself is overlay or lxcfs, rather than whether any such mount exists.
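A minimal sketch of that root-only check, assuming the mount table is read in the `/proc/self/mounts` format via glibc's `getmntent()`. The helper name `rootFsLooksLikeContainer` is hypothetical and not existing htop code. Accepting `fuse.lxcfs` alongside `lxcfs` is an assumption, based on lxcfs being a FUSE file system.

```c
#include <mntent.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns true if the file system mounted at "/" in the
 * given mount table (same format as /proc/self/mounts) is overlay or lxcfs.
 * Mounts below "/" are deliberately ignored. */
static bool rootFsLooksLikeContainer(FILE* mounts) {
   bool result = false;
   struct mntent* ent;

   while ((ent = getmntent(mounts)) != NULL) {
      if (strcmp(ent->mnt_dir, "/") != 0)
         continue;

      /* A later entry for "/" shadows an earlier one, so keep the last match. */
      result = strcmp(ent->mnt_type, "overlay") == 0 ||
               strcmp(ent->mnt_type, "lxcfs") == 0 ||
               strcmp(ent->mnt_type, "fuse.lxcfs") == 0; /* assumed FUSE type name */
   }

   return result;
}
```

A caller would pass something like `fopen("/proc/self/mounts", "r")`. With the `findmnt` output above (zfs root, overlay only below `/var/lib/docker`), this returns false on the host.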
https://unix.stackexchange.com/a/644209 seems quite interesting, since (modern) containers run in separate Linux namespaces.
We should probably also formulate and document an htop-specific definition of a container:
- a curated list of supported container engines (docker, lxc, podman, ...)
- usage of Linux kernel features (pid/user/cgroup namespace, ...) (might include sandboxes)
- specific PID 1 process
- ?
Based on some off-site discussion, we found that we should rework the ways in which we detect containers, both for privileged and unprivileged users. Furthermore, we should see whether we can leverage different sources depending on which resources we need to process anyway (cf. #1222 regarding performance).
Linux PID namespaces can be distinguished by their id, i.e. the inode number of /proc/<pid>/ns/pid; but for processes of other users that link is unfortunately only readable by root.
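As a rough sketch of the inode comparison: `stat()` follows the namespace link and reports the namespace id in `st_ino`. The helper name `pidNamespaceId` is hypothetical. The idea is that a process would count as "in another PID namespace" when its id differs from the one obtained for `self`.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: stores the PID namespace id of `pid` ("self" or a
 * decimal PID as a string) in *out. Returns 0 on success, -1 if the link
 * does not exist or may not be read by the calling user. */
static int pidNamespaceId(const char* pid, ino_t* out) {
   char path[64];
   int n = snprintf(path, sizeof(path), "/proc/%s/ns/pid", pid);
   if (n < 0 || (size_t)n >= sizeof(path))
      return -1;

   struct stat st;
   /* stat() dereferences the link; the inode number identifies the namespace. */
   if (stat(path, &st) != 0)
      return -1;

   *out = st.st_ino;
   return 0;
}
```

The -1 return is where a fallback to another data source would have to take over for unprivileged users.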
@cgzones oh, that's just jarred an old memory - kernel commit e4bc33245124db69b74a6d853ac76c2976f472d5 might be able to help us here - it adds (similar/same?) namespace pid info to /proc/PID/status which is more widely readable
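For illustration, assuming the field in question is the `NSpid:` line of /proc/PID/status (one PID per nested PID namespace, outermost first), a parser over the already-read file contents could look roughly like this. The helper name `countNSpidLevels` is hypothetical; more than one value would mean the process lives in a PID namespace nested below the reader's.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: given the text of /proc/<pid>/status, returns the
 * number of PIDs listed on the "NSpid:" line, or 0 if the line is absent
 * (e.g. on kernels that do not provide it). */
static int countNSpidLevels(const char* status) {
   const char* line = status;

   while (line && *line) {
      if (strncmp(line, "NSpid:", 6) == 0) {
         int count = 0;
         const char* p = line + 6;

         /* Count runs of digits up to the end of this line. */
         while (*p && *p != '\n') {
            if (isdigit((unsigned char)*p)) {
               count++;
               while (isdigit((unsigned char)*p))
                  p++;
            } else {
               p++;
            }
         }
         return count;
      }

      /* Advance to the start of the next line, if any. */
      const char* nl = strchr(line, '\n');
      line = nl ? nl + 1 : NULL;
   }

   return 0;
}
```

Since this works on a buffer rather than opening the file itself, it only adds cost when the status file is being read for other fields anyway.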
Yes, that's what triggered the performance issues from #1222 as reading the status file seems to be quite slow. Not an issue if we have to read that file anyways, but we should have a fallback available for when we are not reading that file for some other information.
But if we fetch its contents anyway, we should clearly go that route (hence my remark on privileged vs. unprivileged users); i.e. avoid reading /proc/<pid>/status and get the information from elsewhere if the namespace id is the only field we would be reading it for.