On MacOS with krunkit installed, the `run basic podman commands [It] Volume ops` test fails

Open cevich opened this issue 1 year ago • 13 comments

Issue Description

On MacOS with krunkit installed, the `run basic podman commands [It] Volume ops` test fails.

Steps to reproduce the issue

  1. On MacOS, run `brew tap slp/krunkit`, then `brew install krunkit`.
  2. Remove the conflicting symlink: `rm -vf /opt/homebrew/bin/vfkit` (ref. PR)
  3. Run `brew tap cfergeau/crc`, then `brew install vfkit`.
  4. Clone the podman repo.
  5. `export CONTAINERS_MACHINE_PROVIDER="libkrun"`
  6. `make localmachine`
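
The steps above can be strung together roughly as follows. This is only a sketch: the commands come from the list, while the `run` wrapper, the `DRY_RUN` variable, and the clone URL and checkout directory are assumptions added for illustration. By default it only prints what it would do.

```shell
# Sketch of the reproduction steps. DRY_RUN and run() are illustrative
# helpers, not part of podman or its Makefile.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    run brew tap slp/krunkit
    run brew install krunkit
    # Remove the conflicting symlink before installing vfkit from the other tap.
    run rm -vf /opt/homebrew/bin/vfkit
    run brew tap cfergeau/crc
    run brew install vfkit
    # Assumed clone URL and checkout directory.
    run git clone https://github.com/containers/podman.git
    run cd podman
    run export CONTAINERS_MACHINE_PROVIDER="libkrun"
    run make localmachine
}

reproduce
```

Setting `DRY_RUN=0` in the environment would execute the commands for real instead of printing them.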

Describe the results you received

Example annotated log

Describe the results you expected

Test should pass

podman info output

Ref. related build CI task: https://cirrus-ci.com/task/5136030032986112

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Mac setup PR: Not in production use at the time this issue was opened.

Additional information

Happens every run.

cevich avatar Jul 16 '24 20:07 cevich

@slp FYI

rhatdan avatar Jul 17 '24 11:07 rhatdan

Testing locally, I see that mounts are supported, and the default mounts should be the same as with applehv, so I don't see a reason why this should fail with krun.

Luap99 avatar Jul 17 '24 11:07 Luap99

/Users/cevichTesting-0-worker/ci/task-4648064202309632/bin/darwin/podman -r run -v /private/tmp/ci/ginkgo1303331317:/test:Z quay.io/libpod/alpine_nginx ls /test/attr-test-file
  Trying to pull quay.io/libpod/alpine_nginx:latest...
  Getting image source signatures
  Copying blob sha256:d2c7362ca710ad35a846a34571a7c3450ea3cce04efcbcb4d3af276eda154ade
  Copying blob sha256:df9b9388f04ad6279a7410b85cedfdcb2208c0a003da7ab5613af71079148139
  Copying blob sha256:71895e83ea49901b7b752bbf3ca19a54148a5f4ab5fdff3dca9bcd59d44c59e3
  Copying config sha256:ecea49d99daa5bd62ebaef1338f6bc4c948bf2651b139160404f9c1c48fcd85c
  Writing manifest to image destination
  WARNING: image platform (linux/amd64) does not match the expected platform (linux/arm64)
  Error: statfs /private/tmp/ci/ginkgo1303331317: no such file or directory

How is /private/tmp/ci/ginkgo1303331317 created? Is the test code publicly available somewhere?

slp avatar Jul 17 '24 12:07 slp

@slp on podman main, this is what I am using.

$ export CONTAINERS_MACHINE_PROVIDER=libkrun
$ TMPDIR=/private/tmp make localmachine FOCUS="Volume ops"

But this works for me so there is something special with the CI setup and likely not related to krun.

Possible the flake was not fixed or somehow special for libkrun: https://github.com/containers/podman/issues/22569

Luap99 avatar Jul 17 '24 12:07 Luap99

Possible the flake was not fixed or somehow special for libkrun

Correct, I've never seen this during my recent libkrun testing, when CONTAINERS_MACHINE_PROVIDER=. With libkrun, it's not a flake, it fails 100% of the time.

cevich avatar Jul 17 '24 13:07 cevich

In case it matters, in this CI environment:

  • We're running as a regular user w/o any admin permissions.
  • $TMPDIR=/private/tmp/ci and $HOME=/home/$USER/ci.
  • A local SSD volume (root) is mounted on both $TMPDIR and /home/$USER.

Some details:

cevichTesting-0:~ ec2-user$ id cevichTesting-0-worker
uid=502(cevichTesting-0-worker) gid=20(staff) groups=20(staff),12(everyone),61(localaccounts),701(com.apple.sharepoint.group.1),100(_lpoperator)
cevichTesting-0:~ ec2-user$ stat /private/tmp/ci
  File: /private/tmp/ci
  Size: 128             Blocks: 0          IO Block: 4096   directory
Device: 1,31    Inode: 2           Links: 4
Access: (1770/drwxrwx--T)  Uid: (  502/cevichTesting-0-worker)   Gid: (   20/   staff)
Access: 2024-07-17 13:05:04.288840817 +0000
Modify: 2024-07-17 13:05:04.487994354 +0000
Change: 2024-07-17 13:05:04.487994354 +0000
 Birth: 2024-07-16 19:22:41.627913185 +0000
cevichTesting-0:~ ec2-user$ stat /Users/cevichTesting-0-worker/
  File: /Users/cevichTesting-0-worker/
  Size: 224             Blocks: 0          IO Block: 4096   directory
Device: 1,32    Inode: 2           Links: 7
Access: (0750/drwxr-x---)  Uid: (  502/cevichTesting-0-worker)   Gid: (   20/   staff)
Access: 2024-07-16 19:22:43.015498650 +0000
Modify: 2024-07-16 19:40:44.975385767 +0000
Change: 2024-07-16 19:40:44.975385767 +0000
 Birth: 2024-07-16 19:22:43.015498650 +0000
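
To collect comparable details on another machine, something like the following could be used. It is a minimal sketch: `describe_path` and `describe_env` are hypothetical helper names, and the mount point parsing assumes mount points without spaces.

```shell
# Print user, TMPDIR and HOME details similar to the ones listed above.
describe_path() {
    # $1 = label, $2 = directory to inspect
    if [ -z "$2" ] || [ ! -d "$2" ]; then
        echo "$1 is unset or not a directory"
        return 0
    fi
    # With -P, the second output line of df describes the filesystem holding
    # the path; its last field is the mount point (assuming no spaces in it).
    echo "$1=$2 mount=$(df -P "$2" | awk 'NR==2 {print $NF}')"
}

describe_env() {
    echo "user=$(id -un) uid=$(id -u) groups=$(id -Gn)"
    describe_path TMPDIR "${TMPDIR:-}"
    describe_path HOME "${HOME:-}"
}

describe_env
```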

cevich avatar Jul 17 '24 14:07 cevich

How is /private/tmp/ci/ginkgo1303331317 created? Is the test code publicly available somewhere?

I believe it's created in the test here:

https://github.com/containers/podman/blob/36bab759b25621ed459ed3c662aa70e27e2e90a6/pkg/machine/e2e/basic_test.go#L65
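
For manual debugging outside the test suite, the failing step can be approximated in shell. This is a rough sketch, not the test's actual code: it assumes the test creates a temporary directory under `$TMPDIR` containing a file named `attr-test-file` (both inferred from the log above), and `volume_ops_check` is a hypothetical helper name. The `podman` invocation mirrors the one in the log.

```shell
volume_ops_check() {
    local tmp_root dir rc
    tmp_root="${TMPDIR:-/tmp}"
    # Mimic the ginkgo-style directory name seen in the log.
    dir="$(mktemp -d "${tmp_root%/}/ginkgoXXXXXXXXXX")" || return 1
    touch "$dir/attr-test-file"

    rc=0
    # With -r (remote), the bind-mount source is, as far as I understand,
    # resolved on the machine VM side, so the directory must be visible in
    # the guest and not only on the Mac.
    podman -r run -v "$dir:/test:Z" quay.io/libpod/alpine_nginx \
        ls /test/attr-test-file || rc=$?

    rm -rf "$dir"
    return "$rc"
}
```

Running it as `TMPDIR=/private/tmp/ci volume_ops_check` would match the CI layout described earlier in the thread.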

cevich avatar Jul 17 '24 14:07 cevich

@cevich I think we're going to need some debugging to dig deeper into this issue. Is it possible to connect to the CI machine to run some tests? It would be useful to do a manual `podman machine init && podman machine start`, then connect to the VM with `podman machine ssh` and check whether the volumes /private and /Users are exposed to the guest.
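
The manual check suggested here might look like the following sketch. `check_guest_mounts` is a hypothetical helper, `podman-machine-default` is assumed to be the machine name, and the sketch assumes `podman machine ssh` hands a single command string to the guest shell. The machine needs to be initialized and started first.

```shell
check_guest_mounts() {
    local machine dir out missing
    machine="${1:-podman-machine-default}"
    missing=0
    for dir in /private /Users; do
        # Rely on the command's output rather than its exit status, so the
        # check does not depend on how exit codes are propagated over ssh.
        out="$(podman machine ssh "$machine" "test -d $dir && echo present" 2>/dev/null || true)"
        case "$out" in
            *present*) echo "ok: $dir is visible in the guest" ;;
            *)         echo "MISSING: $dir is not visible in the guest"; missing=1 ;;
        esac
    done
    return "$missing"
}
```

A non-zero return from `check_guest_mounts` would indicate that at least one of the two host directories is not exposed to the guest.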

slp avatar Jul 17 '24 15:07 slp

Is it possible to connect to the CI machine

Yes, and in fact I'm using a Mac that's already isolated from the rest of our system. @slp I messaged you on slack.

cevich avatar Jul 17 '24 15:07 cevich

Thanks to @cevich's help, we were able to debug this issue. https://github.com/containers/libkrun/pull/209 fixes this.

slp avatar Jul 19 '24 11:07 slp

Thanks @slp for "getting your hands dirty" and figuring it out.

FYI: I'll be off on PTO next week, so I won't be enabling libkrun testing in Podman until I return. There are changes needed on the Macs used for CI and I don't want to risk breaking something while I'm away :wink:

cevich avatar Jul 19 '24 12:07 cevich

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Aug 19 '24 00:08 github-actions[bot]

@slp Is there a new krunkit version with the fix we can use?

Luap99 avatar Sep 19 '24 08:09 Luap99

@slp Is there a new krunkit version with the fix we can use?

Yes, it also includes the ability to increase the SHM window of virtio-gpu for running larger AI models. We're testing it now and we'll send a PR to podman by the end of the week.

slp avatar Oct 01 '24 07:10 slp