CI: runc cgroup error msg flake
[+0588s] not ok 561 copy-file-relative-context-dir
[+0588s] # (from function `expect_line_count' in file ./helpers.bash, line 591,
[+0588s] # in test file ./copy.bats, line 597)
[+0588s] # `expect_line_count 1' failed
[+0588s] # /var/tmp/go/src/github.com/containers/buildah/tests /var/tmp/go/src/github.com/containers/buildah/tests
[+0588s] # # [checking for: docker.io/library/busybox]
[+0588s] # # [restoring from cache: /tmp/bats-run-lPAMEB/suite/buildah-image-cache / docker.io/library/busybox]
[+0588s] # Getting image source signatures
[+0588s] # Copying blob sha256:9758c28807f21c13d05c704821fdd56c0b9574912f9b916c65e1df3e6b8bc572
[+0588s] # Copying config sha256:f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f
[+0588s] # Writing manifest to image destination
[+0588s] # # /var/tmp/go/src/github.com/containers/buildah/tests/./../bin/buildah from --quiet --signature-policy /var/tmp/go/src/github.com/containers/buildah/tests/./policy.json busybox
[+0588s] # busybox-working-container
[+0588s] # # /var/tmp/go/src/github.com/containers/buildah/tests/./../bin/buildah copy --contextdir /tmp/buildah_tests.ishht1/context busybox-working-container test_file /opt/
[+0588s] # 42145a076d5e72262bea80733dac7785067a42f76c7b45a5a047f70e4403440f
[+0588s] # # /var/tmp/go/src/github.com/containers/buildah/tests/./../bin/buildah run busybox-working-container ls -1 /opt/
[+0588s] # test_file
[+0588s] # time="2025-06-04T04:16:33-05:00" level=error msg="seek /sys/fs/cgroup/system.slice/runc-buildah-buildah4055347767.scope/cgroup.freeze: no such device"
[+0588s] # #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
[+0588s] # #| FAIL: buildah run busybox-working-container ls -1 /opt/
[+0588s] # #| Expected 1 lines of output, got 2
[+0588s] # #| Output was:
[+0588s] # #| >test_file
[+0588s] # #| >time="2025-06-04T04:16:33-05:00" level=error msg="seek /sys/fs/cgroup/system.slice/runc-buildah-buildah4055347767.scope/cgroup.freeze: no such device"
[+0588s] # #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
https://api.cirrus-ci.com/v1/task/5992802590392320/logs/integration_test.log
That one's cropping up frequently.
This is a really bad flake. I have rerun the test on my PR 5 times now and it is still failing.
time="2025-06-04T04:16:33-05:00" level=error msg="seek /sys/fs/cgroup/system.slice/runc-buildah-buildah4055347767.scope/cgroup.freeze: no such device"
Looking at the code, I suspect the root cause is the cgroup reading here: https://github.com/opencontainers/cgroups/blob/b970779131d3e4540132ccfb16dc49890491f8d5/fs2/freezer.go#L53-L71
I guess the issue is that the cgroup was deleted after the open but before the seek call. If open doesn't error on ENODEV, then maybe the seek/read shouldn't either. I'm not sure why the code seeks at all: we just opened the file, so that seems like an unnecessary syscall.
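To make the suspected race concrete, here is a minimal sketch of the tolerant read suggested above. It is written in Python purely for illustration and is not the Go code in opencontainers/cgroups. The names `cgroup_gone` and `read_freeze_state` are hypothetical, and both the choice to treat ENODEV like ENOENT and the assumption that a single read at offset 0 is enough (so no separate seek is needed) are guesses taken from this comment, not confirmed behavior.

```python
import errno
import os
from typing import Optional

# Assumption: these errnos mean the cgroup disappeared between open() and a
# later seek()/read(), so the caller can treat the cgroup as already gone.
_GONE_ERRNOS = {errno.ENODEV, errno.ENOENT}


def cgroup_gone(err: OSError) -> bool:
    """Report whether err looks like 'the cgroup no longer exists'."""
    return err.errno in _GONE_ERRNOS


def read_freeze_state(path: str) -> Optional[str]:
    """Return the stripped contents of a cgroup.freeze-style file.

    Returns None instead of raising when the file (or its cgroup) vanished
    before or during the read.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as err:
        if cgroup_gone(err):
            return None
        raise
    try:
        # pread() reads from offset 0 directly, so no separate seek is issued.
        data = os.pread(fd, 16, 0)
    except OSError as err:
        if cgroup_gone(err):
            return None
        raise
    finally:
        os.close(fd)
    return data.decode().strip()
```

With this shape, a caller that gets `None` can skip logging an error, which is the line that ends up in the test output above.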
@kolyshkin Any chance you could have a look at this? crun seems to handle this per https://github.com/containers/crun/pull/539/commits/c6bd3143e2434b4ee4163e37045595bb6298090c, which I believe is why we don't see the issue there.
Looks like it is being tracked in https://github.com/opencontainers/runc/issues/4798
@nalind This is flaking on basically every buildah PR I look at. Should we revert the runc testing here, at least until this flake gets sorted out in runc?
Yeah, I don't have knowledge of when it's going to be resolved.
Being fixed by https://github.com/opencontainers/cgroups/pull/25, will do my best to fast-track it.
#6286 proposes moving those test tasks to their own non-blocking groups in the meantime.
runc pr: https://github.com/opencontainers/runc/pull/4805
This is now merged into runc main via https://github.com/opencontainers/runc/pull/4808. Guess this one can be closed if there are no new failures?
@kolyshkin Thanks, but we need to wait until runc is released with this fix and then put into the distribution packages so we can update them in our CI env. As long as it still flakes in CI, we should keep this open.
Sorry, I thought you were testing runc git HEAD. Will try to fix it in all supported branches.
A friendly reminder that this issue had no activity for 30 days.