dkms icon indicating copy to clipboard operation
dkms copied to clipboard

dkms does not appear to be chroot safe

Open johnnzz opened this issue 4 years ago • 4 comments

It appears that dkms does not work correctly in a chroot environment. You can see an example of this by simply doing:

smw:/image_roots # chroot my_image_root/ smw:/ # dkms status /usr/sbin/dkms: line 1980: /dev/fd/62: No such file or directory /usr/sbin/dkms: line 1912: /dev/fd/62: No such file or directory smw:/ #

The reason this is important is many HPC cluster management systems build images off-line in a chroot prior to deployment. And we have been encountering packages that use dkms in the rpm post install scriptlets and these packages fail to install in these contexts.

A lot of these systems use zypper under the covers, like:

smw:/ # zypper --root /image_roots/my_image_root install rocm-dkms

It seems the issue is that some of the functions in dkms rely on on the shell mechanism described here:

https://unix.stackexchange.com/questions/156084/why-does-process-substitution-result-in-a-file-called-dev-fd-63-which-is-a-pipe#:~:text=So%20%2Fdev%2Ffd%2F63,output%20of%20your%20ls%20call.

If those functions could be refactored not to use that mechanism, it seems like dkms should work in the zypper --root context, and if that worked, it should work for this whole class of cluster management systems.

johnnzz avatar Feb 18 '21 22:02 johnnzz

Here is an example diff for a workaround we used:

flubber-smw:/var/opt/cray/imps/image_roots # diff gsetterhol_NVIDIA-699/usr/sbin/dkms gsetterhol_NVIDIA-699/usr/sbin/dkms.bak
23,24d22
< set -x
< 
1915,1917c1913
<     local mvka m v k a kern status tmp_file
<     tmp_file=$(mktemp)
<     module_status_weak "$@" > $tmp_file
---
>     local mvka m v k a kern status
1921,1922c1917
<     done < $tmp_file
<     rm $tmp_file
---
>     done < <(module_status_weak "$@")
1986,1988c1981
<     local status mvka m v k a tmp_file
<     tmp_file=$(mktemp)
<     module_status "$@" > $tmp_file
---
>     local status mvka m v k a
2001,2002c1994
<     done < $tmp_file
<     rm $tmp_file
---
>     done < <(module_status "$@")
flubber-smw:/var/opt/cray/imps/image_roots # 

This gets status working in a chroot w/o /dev/, /sys/, and so forth. A better solution might be to use local variables (e.g. no temp file), but hopefully this can help to get started.

gts-cray avatar Apr 12 '21 18:04 gts-cray

@gts-cray can you make that a MR + ideally some tests?

Glancing at the latest code - we have a mktemp_or_die helper, although more importantly it's being used in a number of places - safe_source() to read the config file, around rpm_safe_upgrade, checking status, making a tarball, extracting/importing from tarball, dummy check we can write to /tmp.

So as-is the diff pasted is very incomplete. I'm more concerned that that w/o any tests, we'll end up breaking it all the time.

evelikov avatar Jun 06 '23 16:06 evelikov

It looks like recently in commit 5111f9cfe56cd357d81d7471fd090459bf905f99 additional checks were put in that definitely don't solve this but give a warning instead. @evelikov, the most obvious automated test for this would need to start a container or a chroot environment, requiring some degree of privilege when running the tests (or at least a software stack supporting user namespaces).

An alternative if the bash process substitution ( <( ... ) ) is the only problem would just be a test that validates that process substitution is not present anywhere in the codebase.

dmjacobsen avatar Jun 12 '23 13:06 dmjacobsen

@dmjacobsen thanks for pointing that commit - I already had a look at it while it was proposed. Our test suite is based on Github Actions, which uses docker + containers under the hood.

Seems like you have some ideas already, so if you can put those in terms of PR that would be great.

evelikov avatar Jun 12 '23 13:06 evelikov