podman
On MacOS with krunkit installed, the `run basic podman commands [It] Volume ops` test fails
Issue Description
On MacOS with krunkit installed, the run basic podman commands [It] Volume ops test fails
Steps to reproduce the issue
- On MacOS, `brew tap slp/krunkit`, `brew install krunkit`.
- Remove conflicting symlink `rm -vf /opt/homebrew/bin/vfkit` (ref. PR)
- `brew tap cfergeau/crc` then `brew install vfkit`
- Clone the podman repo
- `export CONTAINERS_MACHINE_PROVIDER="libkrun"`
- `make localmachine`
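The steps above can be collected into a small script. This is only a sketch: the `run` helper, the `DRY_RUN` variable, and the `setup_and_test` name are made up for illustration, and the clone URL is assumed to be the upstream repository on GitHub.

```shell
#!/bin/sh
# run: print a command, then execute it unless DRY_RUN is set.
# (Hypothetical helper, not part of podman or its Makefile.)
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# setup_and_test: the reproduction steps from the issue, in order.
setup_and_test() {
    run brew tap slp/krunkit || return 1
    run brew install krunkit || return 1
    # Remove the conflicting symlink before installing vfkit from the other tap.
    run rm -vf /opt/homebrew/bin/vfkit || return 1
    run brew tap cfergeau/crc || return 1
    run brew install vfkit || return 1
    # Assumed clone URL (upstream repository).
    run git clone https://github.com/containers/podman.git || return 1
    export CONTAINERS_MACHINE_PROVIDER="libkrun"
    run make -C podman localmachine
}
```

Calling `DRY_RUN=1 setup_and_test` only prints the commands, which is a cheap way to review them before running `setup_and_test` for real.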
Describe the results you received
Describe the results you expected
Test should pass
podman info output
Ref. related build CI task: https://cirrus-ci.com/task/5136030032986112
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
Mac setup PR: Not in production use at the time this issue was opened.
Additional information
Happens every run.
@slp FYI
Testing locally, I see that mounts are supported, and the default mounts should be the same as with applehv, so I don't see a reason why this should fail with krun.
/Users/cevichTesting-0-worker/ci/task-4648064202309632/bin/darwin/podman -r run -v /private/tmp/ci/ginkgo1303331317:/test:Z quay.io/libpod/alpine_nginx ls /test/attr-test-file
Trying to pull quay.io/libpod/alpine_nginx:latest...
Getting image source signatures
Copying blob sha256:d2c7362ca710ad35a846a34571a7c3450ea3cce04efcbcb4d3af276eda154ade
Copying blob sha256:df9b9388f04ad6279a7410b85cedfdcb2208c0a003da7ab5613af71079148139
Copying blob sha256:71895e83ea49901b7b752bbf3ca19a54148a5f4ab5fdff3dca9bcd59d44c59e3
Copying config sha256:ecea49d99daa5bd62ebaef1338f6bc4c948bf2651b139160404f9c1c48fcd85c
Writing manifest to image destination
WARNING: image platform (linux/amd64) does not match the expected platform (linux/arm64)
Error: statfs /private/tmp/ci/ginkgo1303331317: no such file or directory
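The failing step in the log above can be approximated by hand, outside of ginkgo. The sketch below assumes the test does little more than create a temporary directory under `$TMPDIR`, put `attr-test-file` in it, and bind-mount it into a container. The function name `repro_volume_ops` is hypothetical, and the `-r` flag from the log is omitted on the assumption that the macOS client is already remote.

```shell
#!/bin/sh
# repro_volume_ops: mimic the "Volume ops" step seen in the CI log.
# (Hypothetical helper; the real test lives in pkg/machine/e2e.)
repro_volume_ops() {
    # Create the host directory the same way the log's path suggests.
    dir="$(mktemp -d "${TMPDIR:-/tmp}/ginkgoXXXXXX")" || return 1
    touch "$dir/attr-test-file"

    # Same volume spec and image as in the log output.
    podman run -v "$dir:/test:Z" quay.io/libpod/alpine_nginx ls /test/attr-test-file
    rc=$?

    rm -rf "$dir"
    return "$rc"
}
```

With a running machine, `TMPDIR=/private/tmp/ci repro_volume_ops` should either list the file or show the same `statfs` error.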
How is /private/tmp/ci/ginkgo1303331317 created? Is the test code publicly available somewhere?
@slp on podman main, this is what I am using.
$ export CONTAINERS_MACHINE_PROVIDER=libkrun
$ TMPDIR=/private/tmp make localmachine FOCUS="Volume ops"
But this works for me so there is something special with the CI setup and likely not related to krun.
Possible the flake was not fixed or somehow special for libkrun: https://github.com/containers/podman/issues/22569
> Possible the flake was not fixed or somehow special for libkrun
Correct, I've never seen this during my recent libkrun testing, when CONTAINERS_MACHINE_PROVIDER=. With libkrun, it's not a flake, it fails 100% of the time.
In case it matters, in this CI environment:
- We're running as a regular user w/o any admin permissions.
- `$TMPDIR=/private/tmp/ci` and `$HOME=/home/$USER/ci`.
- A local SSD volume (root) is mounted on both `$TMPDIR` and `/home/$USER`.
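Since this setup differs from a stock Mac mainly in where volumes are mounted, it may help to check which mount point backs a given path when comparing a local machine with the CI host. A minimal sketch using POSIX `df -P`; the function names are hypothetical, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# mount_point_of: print the mount point of the filesystem holding a path.
# Relies on the POSIX `df -P` layout, where the mount point is the last column.
mount_point_of() {
    df -P "$1" | awk 'NR==2 { print $NF }'
}

# same_mount: succeed if both paths live on the same mounted filesystem.
same_mount() {
    [ "$(mount_point_of "$1")" = "$(mount_point_of "$2")" ]
}
```

For example, `mount_point_of "$TMPDIR"` shows whether the temporary directory sits on its own volume, and `same_mount "$TMPDIR" /private` tells whether it shares a filesystem with `/private`.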
Some details:
cevichTesting-0:~ ec2-user$ id cevichTesting-0-worker
uid=502(cevichTesting-0-worker) gid=20(staff) groups=20(staff),12(everyone),61(localaccounts),701(com.apple.sharepoint.group.1),100(_lpoperator)
cevichTesting-0:~ ec2-user$ stat /private/tmp/ci
File: /private/tmp/ci
Size: 128 Blocks: 0 IO Block: 4096 directory
Device: 1,31 Inode: 2 Links: 4
Access: (1770/drwxrwx--T) Uid: ( 502/cevichTesting-0-worker) Gid: ( 20/ staff)
Access: 2024-07-17 13:05:04.288840817 +0000
Modify: 2024-07-17 13:05:04.487994354 +0000
Change: 2024-07-17 13:05:04.487994354 +0000
Birth: 2024-07-16 19:22:41.627913185 +0000
cevichTesting-0:~ ec2-user$ stat /Users/cevichTesting-0-worker/
File: /Users/cevichTesting-0-worker/
Size: 224 Blocks: 0 IO Block: 4096 directory
Device: 1,32 Inode: 2 Links: 7
Access: (0750/drwxr-x---) Uid: ( 502/cevichTesting-0-worker) Gid: ( 20/ staff)
Access: 2024-07-16 19:22:43.015498650 +0000
Modify: 2024-07-16 19:40:44.975385767 +0000
Change: 2024-07-16 19:40:44.975385767 +0000
Birth: 2024-07-16 19:22:43.015498650 +0000
> How is /private/tmp/ci/ginkgo1303331317 created? Is the test code publicly available somewhere?
I believe it's created in the test here:
https://github.com/containers/podman/blob/36bab759b25621ed459ed3c662aa70e27e2e90a6/pkg/machine/e2e/basic_test.go#L65
@cevich I think we're going to need some debugging to dig deeper into this issue. Is it possible to connect to the CI machine to run some tests? It would be interesting to do a manual `podman machine init && podman machine start`, then connect to the VM with `podman machine ssh` and check whether the volumes `/private` and `/Users` are exposed to the guest.
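The guest-side check suggested above could look roughly like this. It is a sketch, not a tested procedure: the `check_guest_paths` name and the `MACHINE` variable are hypothetical, the machine name is assumed to be podman's default, and the exit status of the remote command is assumed to be passed back through `podman machine ssh`.

```shell
#!/bin/sh
# Machine to inspect; assumed to be podman's default machine name.
MACHINE="${MACHINE:-podman-machine-default}"

# check_guest_paths: report whether each given host path exists as a
# directory inside the machine VM.
check_guest_paths() {
    for p in "$@"; do
        if podman machine ssh "$MACHINE" test -d "$p"; then
            echo "exposed: $p"
        else
            echo "missing: $p"
        fi
    done
}
```

After `podman machine init && podman machine start`, running `check_guest_paths /private /Users` prints one line per path.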
> Is it possible to connect to the CI machine
Yes, and in fact I'm using a Mac that's already isolated from the rest of our system. @slp I messaged you on slack.
Thanks to @cevich's help, we were able to debug this issue. https://github.com/containers/libkrun/pull/209 fixes this.
Thanks @slp for "getting your hands dirty" and figuring it out.
FYI: I'll be off on PTO next week, so I won't be enabling libkrun testing in Podman until I return. There are changes needed on the Macs used for CI and I don't want to risk breaking something while I'm away :wink:
A friendly reminder that this issue had no activity for 30 days.
@slp Is there a new krunkit version with the fix we can use?
> @slp Is there a new krunkit version with the fix we can use?
Yes, it also includes the ability to increase the SHM window of virtio-gpu for running larger AI models. We're testing it now and we'll send a PR to podman by the end of the week.