runc icon indicating copy to clipboard operation
runc copied to clipboard

Support Id mapped mounts for shared volumes

Open AlexeyPerevalov opened this issue 2 years ago • 10 comments

This pull request adds support for id map mount feature for shared volumes. For rootfs this is already implemented in https://github.com/containerd/containerd/pull/5890.

Both commit has a code in common, which should be placed on the same place in the future.

There are two possibilities to use mount_setattr syscall with fd returned by open_tree or with fd returned by fsmount syscall. This PR uses open_tree, since this approach was used in examples, as well as in crun.

As a fallback scenario, mount syscall still used for bind mount. Open_tree syscall as well as mount_setattr are quite new syscall, to use it latest kernel version is required. In this patch set, kernel version is checked, but as an alternative approach we could just check syscall existence, any suggestions are welcome. Also this PR includes specs-go which could be moved to separate PR.

AlexeyPerevalov avatar Mar 23 '22 12:03 AlexeyPerevalov

OK, this should be a draft / DNM until changes to runtime-spec and x/sys/unix are landed.

kolyshkin avatar Mar 23 '22 17:03 kolyshkin

In this patch set, kernel version is checked

This obviously won't work because of e.g. RHEL kernels that backport lots of stuff.

but as an alternative approach we could just check syscall existence, any suggestions are welcome.

Right, a simple test checking if a syscall is available (perhaps calling it with wrong arguments and expecting EINVAL in return is sufficient), when wrap this into once.Do and use it freely.

As an example, see how we check if openat2(2) is available here: https://github.com/opencontainers/runc/blob/c4c48896ebaa6a8ea1ede798f79ba02092f0c894/libcontainer/cgroups/file.go#L79-L98

kolyshkin avatar Mar 23 '22 17:03 kolyshkin

This PR depends on golang's x/sys library modifications: https://go-review.googlesource.com/c/sys/+/397095 https://go-review.googlesource.com/c/sys/+/397094 and depends on https://github.com/opencontainers/runtime-spec/pull/1143. Patches for x/sys are merged, just need to wait for appropriate version release of lib. runtime-spec change is stuck on review, how to initiate it?

AlexeyPerevalov avatar Apr 13 '22 07:04 AlexeyPerevalov

@AlexeyPerevalov the runtime spec PR was already merged (https://github.com/opencontainers/runtime-spec/pull/1143). It would be great if you can finish this PR now :)

rata avatar Jul 12 '22 16:07 rata

runtime-spec changes are not yet tagged, latest tag is v1.0.2 the same with x/sys golang lib, not yet published

AlexeyPerevalov avatar Jul 14 '22 10:07 AlexeyPerevalov

@AlexeyPerevalov No need to wait for a runtime-spec tag, IMHO. We are not using a tag already: https://github.com/opencontainers/runc/blob/a776ec9c90d7a67a924211d6cb1b38620afa6f49/go.mod#L15

For the x/sys pkg, I think it makes sense to define ourselves the constants if that is not going to be released soon. Do you happen to know when they plan to release that with the constants we need? If there is no visibility into that, I don't know what others think, but I would guess it seems reasonable to use some tmp file with the constants defined here. We can switch whenever that becomes available.

rata avatar Jul 14 '22 13:07 rata

runtime-spec changes are not yet tagged, latest tag is v1.0.2

We are already using a sha for it (due to the lack of runtime-spec maintainers I guess).

the same with x/sys golang lib, not yet published

This one is never tagged, i.e. all users are just pin it to a particular sha.

kolyshkin avatar Jul 21 '22 00:07 kolyshkin

in centos-stream-9 two tests are failed: TestUsernsCheckpoint TestCheckpoint

Tests related to checkpoint restore failed due to segfault in criu, I reproduced it on Centos Stream 8 (I don't have Stream 9), but It is not reproducible on Ubuntu 20.04, with working criu.

AlexeyPerevalov avatar Jul 29 '22 13:07 AlexeyPerevalov

@AlexeyPerevalov but it seems all test have passed... Was that maybe before the latest fix merged yesterday to fix the CI? (although IIRC it was only about one of those test, not both)

rata avatar Jul 29 '22 13:07 rata

@AlexeyPerevalov friendly ping? It would be great to have this in for runc 1.2.

(otherwise, we will have to wait for another runc release, that takes a while, some more time for stable versions of high level container runtimes to include it, etc. :))

rata avatar Aug 09 '22 15:08 rata

I really wish to include this functionality in the next runc release. This is required for our userns work in Kubernetes. I'll try to make this work.

If anyone has any thoughts on the approach to take, please let me know :)

rata avatar Jan 27 '23 17:01 rata

I really wish to include this functionality in the next runc release. This is required for our userns work in Kubernetes. I'll try to make this work.

If anyone has any thoughts on the approach to take, please let me know :)

Thanks @rata , could you open a new PR to carry this PR?

AkihiroSuda avatar Jan 30 '23 18:01 AkihiroSuda

@AkihiroSuda Thanks! Yes, I'm working on it and I'll open a new PR in the next days.

I need to start from scratch and do most things in nsenter/nsexec, so nothing really to carry from here. And quite more involved, that is why I haven't opened it yet :)

rata avatar Jan 31 '23 10:01 rata

Opened this draft PR with an approach that works: https://github.com/opencontainers/runc/pull/3717

rata avatar Feb 01 '23 11:02 rata