Move integration tests to Ubuntu Focal

Open RenaudWasTaken opened this issue 4 years ago • 38 comments

Hello!

Following up on the requests in https://github.com/opencontainers/runc/pull/2229, I moved all the integration tests from busybox to debian. This involved the following three operations:

  • Find and replace all occurrences of busybox with debian in the following three contexts:
    • teardown_busybox, setup_busybox
    • BUSYBOX_BUNDLE
    • runc run ./work-dir test_busybox
  • Generate the debian spec using runc rather than umoci
    • This meant updating the hooks test to set the container root to read-write (by default, runc generates a spec with a read-only root)
  • Update some of the error messages (e.g. Permission denied) and error codes that were slightly different with Debian
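
To illustrate the read-only root point: a spec generated by `runc spec` sets `root.readonly` to `true`. The sketch below flips that flag on a hand-written stand-in for the relevant part of `config.json`. The temporary bundle directory and the `sed` substitution are assumptions for illustration, not what the PR does, and a JSON-aware tool such as `jq` would be more robust on a real spec.

```shell
# Hypothetical bundle directory; a real config.json would come from `runc spec`.
bundle=$(mktemp -d)

# Stand-in for the "root" section of a generated spec (read-only by default).
cat > "$bundle/config.json" <<'EOF'
{
  "root": {
    "path": "rootfs",
    "readonly": true
  }
}
EOF

# Make the container root read-write.
# Writing to a temporary file and renaming avoids relying on `sed -i`.
sed 's/"readonly": true/"readonly": false/' "$bundle/config.json" > "$bundle/config.json.tmp"
mv "$bundle/config.json.tmp" "$bundle/config.json"

cat "$bundle/config.json"
```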

/cc @cyphar @AkihiroSuda

Thanks for reviewing!

Signed-off-by: Renaud Gaubert [email protected]

RenaudWasTaken avatar Jun 24 '20 06:06 RenaudWasTaken

Hmm, I wasn't expecting debian not to have ps in the base image...

RenaudWasTaken avatar Jun 24 '20 07:06 RenaudWasTaken

This has bitten me in the past as well. And now is the part where I pitch openSUSE :wink: -- the opensuse/leap:15.1 image is smaller than the Debian one and has both ldconfig and ps in the base image.

cyphar avatar Jun 24 '20 07:06 cyphar

Hmm, I'll look into using the opensuse/leap image then!

RenaudWasTaken avatar Jun 24 '20 07:06 RenaudWasTaken

One other thing you'll need to update is the Go "integration" tests in libcontainer/integration -- they use the busybox image as well. It also wouldn't hurt to drop the hello image as well, which is also a slightly questionable thing we still have hanging around.

cyphar avatar Jun 24 '20 07:06 cyphar

Looks like opensuse works on all the bats integration tests. I'm hitting a small issue in the Go integration tests related to groups:

--- FAIL: TestAdditionalGroups (0.26s)
utils_test.go:55: exec_test.go:487: unexpected error: container_linux.go:367: starting container process caused: setup user: Unable to find group plugdev

Will look into it a bit more tomorrow morning :) !

Update: Renaud re-discovered groups in Linux today....

> It also wouldn't hurt to drop the hello image as well, which is also a slightly questionable thing we still have hanging around

I'm happy to drop that one :)

RenaudWasTaken avatar Jun 24 '20 07:06 RenaudWasTaken

Ah, right. Yeah you'll just need to pick a different group to run the user as (openSUSE doesn't have a plugdev group).

cyphar avatar Jun 24 '20 08:06 cyphar
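
The `TestAdditionalGroups` failure above comes down to the group name not being present in the image. A minimal sketch of a pre-check, assuming the rootfs ships a conventional `/etc/group` file; the `group_exists` helper and the sample rootfs are hypothetical and not part of the runc test suite.

```shell
# group_exists ROOTFS NAME: succeed if NAME has an entry in ROOTFS/etc/group.
group_exists() {
    # The first colon-separated field of each line is the group name.
    # -x matches the whole field, -F treats the name as a literal string.
    cut -d: -f1 "$1/etc/group" | grep -qxF -- "$2"
}

# Hypothetical rootfs with a tiny group file for demonstration.
rootfs=$(mktemp -d)
mkdir -p "$rootfs/etc"
printf 'root:x:0:\nusers:x:100:\n' > "$rootfs/etc/group"

for g in users plugdev; do
    if group_exists "$rootfs" "$g"; then
        echo "$g: present"
    else
        echo "$g: missing"
    fi
done
```

Running a check like this against the unpacked image before choosing the group for the test would surface a missing name earlier than the container start error does.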

Working through the last issues!

@cyphar it looks like I can't just rm -rf after a umoci unpack --rootless for the opensuse image. For some reason I need to chmod 755 the following dir: ./rootfs/var/lib/ca-certificates/.

Are you ok with that workaround until we investigate a bit more what is going on there?

e.g:

# sudo -HE -u rootless bash
$ opensuse="opensuse:3.11.6"
$ tmp=$(mktemp -d)
$ cd "$tmp"
$ skopeo copy docker://opensuse/leap:15.1 "oci:$opensuse"
$ umoci unpack --rootless --image "$opensuse" "./"
$ rm -rf ./rootfs
rm: cannot remove 'rootfs/var/lib/ca-certificates/pem/2fa87019.0': Permission denied
....
$ chmod -R 755 "./rootfs/var/lib/ca-certificates/"
$ rm -rf ./rootfs

RenaudWasTaken avatar Jun 25 '20 02:06 RenaudWasTaken
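
A small reproduction of the workaround from the transcript above, as a sketch rather than the PR's code. A non-root user cannot remove entries from a directory that lacks the owner write bit, which is one way to get the `Permission denied` seen above; whether that is the actual cause for the openSUSE image was not established in the thread. The `force_remove` helper and the sample tree are hypothetical, and `u+rwX` is used instead of `755` so that only the owner's bits are touched.

```shell
# force_remove DIR: restore owner permissions on DIR, then delete it.
force_remove() {
    # X adds execute/search only to directories (and files already executable).
    chmod -R u+rwX "$1" && rm -rf "$1"
}

# Hypothetical tree with a directory the owner cannot write to.
tmp=$(mktemp -d)
mkdir -p "$tmp/rootfs/var/lib/certs"
touch "$tmp/rootfs/var/lib/certs/example.pem"
chmod 555 "$tmp/rootfs/var/lib/certs"

# As a non-root user, a plain `rm -rf "$tmp/rootfs"` would fail here.
force_remove "$tmp/rootfs"
```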

PR subject and commit message should change debian to opensuse

kolyshkin avatar Jun 29 '20 21:06 kolyshkin

I've been fighting with the CI for a few days now :D !

I think I'm almost done. The most significant issue I've had to resolve is that simply copying the rootfs that umoci extracts hangs in Vagrant (but not in the Dockerfile). See an example here: https://travis-ci.org/github/opencontainers/runc/jobs/701939941

I tracked it down, through plain dumb "code comment" bisection and CI runs to the specific copy instruction: https://github.com/opencontainers/runc/blob/1b94395c06577b36bae4afd2fe5da229f7a03284/tests/integration/helpers.bash#L454

I suspect it's due to the opensuse image since that issue isn't showing up in master and wasn't showing up when using the debian image. Unfortunately I don't have a Vagrant capable machine so I haven't been able to thoroughly debug it (I suspect symlink shenanigans).

The workaround that I've employed is to unpack the rootfs (umoci unpack) for every test, but this has introduced a noticeable 2x-2.5x increase in CI time, e.g.:

  • Fedora takes 50 minutes instead of 20 minutes
  • Go takes 20 minutes instead of 10 minutes
  • cgroup systemd takes 6 minutes instead of 3 minutes

Is this something that someone has already encountered or where there's an obvious solution I'm missing?

RenaudWasTaken avatar Jun 30 '20 03:06 RenaudWasTaken

Hmm, so that's now at the point where the CI exceeds the time limit. I'll see if I can reproduce it some other way.

RenaudWasTaken avatar Jun 30 '20 21:06 RenaudWasTaken

umoci shouldn't make that huge of an impact (though because it has to extract tar archives, it will be a bit slower). I'll look into what might be making it slow. But yeah, since we aren't testing umoci here it would be best to just replicate the existing caching done for busybox.

cyphar avatar Jun 30 '20 22:06 cyphar

> cp -r "$DEBIAN_ROOTFS"/* "$DEBIAN_BUNDLE/"

Wild idea, maybe try running it under strace (right there on CI) to see what is going on?

kolyshkin avatar Jun 30 '20 22:06 kolyshkin

Looks like a cp -R -P $cache/* "$1" yields a different result, I'm pretty sure we are dealing with some symlinks shenanigans now :D !

RenaudWasTaken avatar Jun 30 '20 23:06 RenaudWasTaken

cp -a is more fool-proof than cp -P (and -a implies -R) in this respect. I think the issue is that you're copying /* which can result in symlinks being resolved at the top-level of the directory. -a will preserve that.

But it would just be simpler to do umoci unpack $CACHED_BUNDLE and then do cp -a $CACHED_BUNDLE $BUNDLE without using * (so you don't run into spaces-in-filenames or too-many-arguments problems).

cyphar avatar Jul 01 '20 00:07 cyphar
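
To make the `cp -a` point concrete, here is a minimal sketch on a throwaway directory (the directory and file names are made up). It copies the directory itself rather than `dir/*`, and checks that a symlink inside it arrives as a symlink with the same target.

```shell
# Hypothetical cached bundle containing one regular file and one symlink.
work=$(mktemp -d)
mkdir -p "$work/cache/rootfs/usr/bin"
echo 'hello' > "$work/cache/rootfs/usr/bin/tool"
ln -s usr/bin "$work/cache/rootfs/bin"

# Copy the whole directory; -a implies -R and keeps symlinks, modes and timestamps.
# The destination must not exist yet, otherwise the copy lands inside it.
cp -a "$work/cache" "$work/bundle"

# The symlink is still a symlink pointing at the same relative target.
if [ -L "$work/bundle/rootfs/bin" ]; then
    echo "bin -> $(readlink "$work/bundle/rootfs/bin")"
fi
```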

Looks like CI is green and no significant increase with this latest caching commit!

RenaudWasTaken avatar Jul 01 '20 00:07 RenaudWasTaken

> cp -a is more fool-proof than cp -P (and -a implies -R) in this respect. I think the issue is that you're copying /* which can result in symlinks being resolved at the top-level of the directory. -a will preserve that.

I'll update the last commit, when I was copying stuff around, symlinks were definitely not at the root though.

> But it would just be simpler to do umoci unpack $CACHED_BUNDLE and then do cp -a $CACHED_BUNDLE $BUNDLE without using * (so you don't run into spaces-in-filenames or too-many-arguments problems).

If I don't cache the result of umoci unpack I get a pretty high CI time (see https://travis-ci.org/github/opencontainers/runc/builds/703395550?utm_source=github_status&utm_medium=notification), which is the direct result of the first commit: https://github.com/opencontainers/runc/pull/2486/commits

Let me update the *, I was testing things around :)

> Wild idea, maybe try running it under strace (right there on CI) to see what is going on?

Hmm yeah that would have worked too :D !

RenaudWasTaken avatar Jul 01 '20 00:07 RenaudWasTaken

> If I don't cache the result of umoci unpack I get a pretty high CI time

I was trying to say (a little clumsily) "instead of caching just bundle/rootfs, cache the entire bundle so you don't need to copy *" not that we shouldn't cache at all.

cyphar avatar Jul 01 '20 00:07 cyphar
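
The caching idea can be sketched as follows: unpack once into a cache directory, then give each test its own copy of the whole bundle with `cp -a`. The `get_bundle` helper, the directory layout and the `fake_unpack` stand-in are all hypothetical; in the real tests the unpack step would be the `umoci` invocation, which is deliberately not reproduced here.

```shell
# get_bundle CACHE DEST UNPACK_CMD...: populate CACHE once, then copy it to DEST.
# DEST must not exist yet, otherwise cp would place the copy inside it.
get_bundle() {
    cache=$1
    dest=$2
    shift 2
    # Run the (expensive) unpack command only when the cache is missing.
    if [ ! -d "$cache" ]; then
        "$@" "$cache"
    fi
    # Copy the directory itself, not "$cache"/*, so no glob expansion is involved.
    cp -a "$cache" "$dest"
}

# Stand-in for the real unpack step; it logs each run so the calls can be counted.
work=$(mktemp -d)
fake_unpack() {
    mkdir -p "$1/rootfs"
    echo '{}' > "$1/config.json"
    echo run >> "$work/unpack.log"
}

get_bundle "$work/cache" "$work/test1" fake_unpack
get_bundle "$work/cache" "$work/test2" fake_unpack

echo "unpack ran $(wc -l < "$work/unpack.log" | tr -d ' ') time(s)"
```

Two bundles are produced here while the unpack stand-in runs only once, which is the property that keeps CI time close to the busybox baseline.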

> I was trying to say (a little clumsily) "instead of caching just bundle/rootfs, cache the entire bundle so you don't need to copy *" not that we shouldn't cache at all.

It looks like we have the same idea in mind :D ! I think the latest version reflects that, let me know if I'm completely misreading your message and the code doesn't reflect that.

RenaudWasTaken avatar Jul 01 '20 00:07 RenaudWasTaken

I'm tracking the performance of umoci in opencontainers/umoci#339. However, unless I'm mistaken you should see a ~38% performance boost if you just do umoci raw unpack (which disables go-mtree generation and doesn't generate config.json either).

cyphar avatar Jul 02 '20 01:07 cyphar

I managed to get caching running yesterday :) ! I don't think there is any significant difference with the master CI now! This is ready for review! Let me know if I need to change anything :)

RenaudWasTaken avatar Jul 02 '20 03:07 RenaudWasTaken

Does opensuse work on s390x and ppc64le?

AkihiroSuda avatar Jul 02 '20 03:07 AkihiroSuda

Debian seems the best for multi-arch testing

AkihiroSuda avatar Jul 02 '20 03:07 AkihiroSuda

If we use opensuse/tumbleweed then s390x and ppc64le are supported -- and I'd suggest using tumbleweed anyway. opensuse/leap supports ppc64le, but not s390x (this is because Leap doesn't build on s390x but I'll ask the Leap folks whether they plan to change this).

cyphar avatar Jul 02 '20 04:07 cyphar

Where is the list of the supported architectures? Couldn't find on google :sob:

AkihiroSuda avatar Jul 02 '20 04:07 AkihiroSuda

Hmm there's only a latest tag available for tumbleweed, is that ok? https://hub.docker.com/r/opensuse/tumbleweed

RenaudWasTaken avatar Jul 02 '20 04:07 RenaudWasTaken

Yeah that's fine @RenaudWasTaken -- Tumbleweed is a rolling-release distribution.

cyphar avatar Jul 02 '20 04:07 cyphar

@AkihiroSuda

> Where is the list of the supported architectures? Couldn't find on google :sob:

I only just found this out, but you can find it by looking at the tags tab on Docker Hub and you can see which architectures we build for here.

cyphar avatar Jul 02 '20 04:07 cyphar

Will openSUSE support RISC-V and MIPS?

AkihiroSuda avatar Jul 02 '20 05:07 AkihiroSuda

Switched to tumbleweed

RenaudWasTaken avatar Jul 02 '20 05:07 RenaudWasTaken

So umount is not part of tumbleweed! Update: Never mind, I just need to update the test, it's called unmount

RenaudWasTaken avatar Jul 02 '20 05:07 RenaudWasTaken