CI: vfkit broken on macos tahoe runners
I think vfkit v0.6.2 might be broken, at least in our CI environment.
→ Enter [It] no settings should change if no flags - /Users/MacM1-5-worker/ci/task-6413995885723648/pkg/machine/e2e/set_test.go:96 @ 01/07/26 09:22:16.64
/Users/MacM1-5-worker/ci/task-6413995885723648/bin/darwin/podman machine init --disk-size 11 --image /private/tmp/ci/podman-machine.aarch64.applehv.raw a509bf48caa4
Machine init complete
To start your machine run:
podman machine start a509bf48caa4
/Users/MacM1-5-worker/ci/task-6413995885723648/bin/darwin/podman machine set a509bf48caa4
/Users/MacM1-5-worker/ci/task-6413995885723648/bin/darwin/podman machine start a509bf48caa4
Starting machine "a509bf48caa4"
Error: vfkit exited unexpectedly with exit code 1
[FAILED] Expected
<int>: 125
to match exit code:
<int>: 0
In [It] at: /Users/MacM1-5-worker/ci/task-6413995885723648/pkg/machine/e2e/set_test.go:112 @ 01/07/26 09:22:24.151
Full Stack Trace
github.com/containers/podman/v6/pkg/machine/e2e_test.init.func18.3()
/Users/MacM1-5-worker/ci/task-6413995885723648/pkg/machine/e2e/set_test.go:112 +0x3fc
https://api.cirrus-ci.com/v1/artifact/task/6413995885723648/html/machine-applehv-podman-darwin-rootless-host.log.html#t--podman-machine-set-set-machine-cpus--disk--memory--1
I guess someone needs to run that with log level debug to get the error message?
cc @baude @ashley-cui @cfergeau
Thanks for the heads up! I created a v0.6.2 tag yesterday but haven’t fully finished the release yet, so it’s good to have early feedback!
I’ve just tested podman 5.7.1 with the unsigned binary from https://github.com/crc-org/vfkit/actions/runs/20755542863 and podman machine start worked fine, so it’s not totally broken. I’ll take a closer look at the output from your tests.
I cannot reproduce these failures locally on my M1, and I’m not sure what fails in CI. I’ll try again tomorrow.
To be clear, I am not confident that vfkit is the issue; I just noticed the new bump, so I thought it was related. Without a reproducer it might still be something else. I bumped our macOS runners to Tahoe on Monday, so it may be related to that, since we haven't had other runs since then.
OK, I finally checked the failing macOS worker: it runs vfkit 0.6.1, unless I am missing something about the CI setup.
So the version bump looks like a red herring to me, but it does seem to fail on our worker with Tahoe, so I guess we need to try to reproduce on that. Maybe it is something related to the AWS image we pull in there.
ProductName: macOS
ProductVersion: 26.2
BuildVersion: 25C56
I cannot reproduce this on my Mac with the following setup:
➜ ~ podman -v
podman version 5.7.1
➜ ~ vfkit -v
vfkit version: v0.6.1
➜ ~ sw_vers
ProductName: macOS
ProductVersion: 26.2
BuildVersion: 25C56
Could this be an AMI issue? @Luap99
@cfergeau did you guys hit anything like this in your testing with tahoe?
Testing on Tahoe is not as extensive as we’d like as the only macos runners with virtualization support are limited to macos 15. I should take a look at what you are using for your macOS e2e tests, maybe vfkit could use something similar.
There are no known regressions that I know of when moving from 15 to 26, or from vfkit 0.6.x to 0.6.2, but I’m trying to reproduce this issue to get a better idea of what’s going on. I’m running the e2e tests with podman 5.7.1 for now, but no failure. This report was against the main branch though, so I’ll be testing this next.
With a local run on Tahoe/M1/vfkit 0.6.2, I can’t reproduce the failures with podman 5.7.1 nor with podman main:
Summarizing 1 Failure:
[PANICKED!] podman machine compose [It] compose test environment variable setup
/opt/homebrew/Cellar/go/1.25.5/libexec/src/runtime/panic.go:115
The failure looks like a test issue, not something vfkit-related:
[PANICKED] Test Panicked
In [It] at: /opt/homebrew/Cellar/go/1.25.5/libexec/src/runtime/panic.go:115 @ 01/08/26 10:04:41.637
runtime error: index out of range [1] with length 1
Full Stack Trace
github.com/containers/podman/v6/pkg/machine/e2e_test.init.func7.1()
/Users/teuf/dev/podman/pkg/machine/e2e/compose_test.go:44 +0x6a8
I’m using this to run the test locally:
TMPDIR=/private/tmp make ginkgo-run GINKGO_PARALLEL=n TAGS="remote exclude_graphdriver_btrfs containers_image_openpgp" GINKGO_FLAKE_ATTEMPTS=0 FOCUS_FILE= GINKGOWHAT=pkg/machine/e2e/.
Just realized that most of the local tests with the main branch were running with krunkit so I don’t know if I tested the right thing.
Sorry, I just saw that https://github.com/containers/podman/pull/27875 passes while https://github.com/containers/podman/pull/27872 (the vfkit code update) fails. I had assumed that, since the binary execution failed (and we don't update the binary on the runner based on the PR), it must be related to the environment, not the code change itself. But maybe the new vfkit code passes an argument that the older vfkit binary cannot understand?
Yeah, the issue is that the new vfkit code forces a new option AFAICT, so this upgrade is not backwards compatible.
Since we have little control over how podman is packaged, we cannot enforce that the vfkit Go code in podman matches the binary version on the host, so we must fix this in a way that preserves backwards compatibility.
Error: unknown option for virtio-net devices: type
Usage:
vfkit [flags]
...
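To illustrate what preserving backwards compatibility can look like on the Go side, here is a minimal sketch. It is not taken from vfkit or podman: `netDevice`, `buildNetArg`, the `Type` field, the constant `assumedDefaultType`, and the values `legacy` and `newmode` are all hypothetical, and only the option name `type` comes from the error message above. The idea is to emit the new option only when it differs from the assumed default, so an older binary that does not know the option still accepts the default configuration.

```go
package main

import (
	"fmt"
	"strings"
)

// assumedDefaultType is a made-up default value used only for this sketch.
const assumedDefaultType = "legacy"

// netDevice is a hypothetical, simplified stand-in for a virtio-net
// device configuration. It is not vfkit's real type.
type netDevice struct {
	MAC  string // illustrative pre-existing option
	Type string // hypothetical new option; "" or the default means "omit"
}

// buildNetArg renders the device as a single comma-separated argument.
// The new "type" option is skipped when it is empty or equal to the
// assumed default, so a binary that predates the option never sees it
// for default configurations.
func buildNetArg(d netDevice) string {
	parts := []string{"virtio-net"}
	if d.MAC != "" {
		parts = append(parts, "mac="+d.MAC)
	}
	if d.Type != "" && d.Type != assumedDefaultType {
		parts = append(parts, "type="+d.Type)
	}
	return strings.Join(parts, ",")
}

func main() {
	// Default configuration: no "type" option is emitted.
	fmt.Println(buildNetArg(netDevice{MAC: "5a:94:ef:e4:0c:ee", Type: assumedDefaultType}))
	// Non-default configuration: the new option is emitted explicitly.
	fmt.Println(buildNetArg(netDevice{MAC: "5a:94:ef:e4:0c:ee", Type: "newmode"}))
}
```

With this approach only users who opt into the non-default behaviour need a binary that understands the new option; everyone else keeps working with whatever version is installed on the host.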
Thanks for the investigation. Hopefully https://github.com/cfergeau/vfkit/commit/f504b6ac1b74c114279b652c7a0de8f65bcc22b8 will fix this; I still need to test it.
I’m using this to run the test locally:
TMPDIR=/private/tmp make ginkgo-run GINKGO_PARALLEL=n TAGS="remote exclude_graphdriver_btrfs containers_image_openpgp" GINKGO_FLAKE_ATTEMPTS=0 FOCUS_FILE= GINKGOWHAT=pkg/machine/e2e/.
CONTAINERS_MACHINE_PROVIDER=applehv also needs to be set on the main branch in order to use vfkit in e2e tests. With this + go get github.com/crc-org/vfkit && go mod tidy && go mod vendor, and with vfkit 0.6.1 installed, I was able to reproduce.
https://github.com/cfergeau/vfkit/commit/f504b6ac1b74c114279b652c7a0de8f65bcc22b8 solves the issue.
This should be fixed if podman uses the go code from github.com/crc-org/[email protected]
Thank you @cfergeau!