Windows CI began to fail on Oct 21
#2769 passed the CI, but its merge commit and later ones are failing
https://github.com/lima-vm/lima/actions/runs/11429806278/job/31800191430
[…]
time="2024-10-21T01:14:55Z" level=info msg="SSH Local Port: 22"
time="2024-10-21T01:14:55Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:15:05Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:15:15Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:24:43Z" level=fatal msg="did not receive an event with the \"running\" status"
Something seems to have changed between https://github.com/actions/runner-images/releases/tag/win22%2F20241006.1 and https://github.com/actions/runner-images/releases/tag/win22%2F20241015.1
In https://github.com/lima-vm/lima/actions/runs/11445753347/job/31843450765?pr=2778 I see:
System has not been booted with systemd as init system (PID 1). Can't operate.
@pendo324 Do you have any idea what may be causing the Windows tests to fail now?
I can't find anything that seems related in https://github.com/actions/runner-images/commit/fcc4cdb1d095af1317859c4809364538953b3497 or https://github.com/actions/runner-images/commit/09ff567de6908096a96ace47eb3f41079993366d
The errors look like systemd is no longer enabled in your distro, but there has been no change to the distro.
I'm at a loss on what might be causing this.
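For anyone who wants to confirm from inside the distro whether systemd is actually the running init, one option is the check documented in sd_booted(3): whether the directory /run/systemd/system exists. A minimal sketch, assuming it is run inside the guest; the function name and the path parameter are illustrative and not part of Lima:

```python
from pathlib import Path


def booted_with_systemd(run_dir="/run/systemd/system"):
    """Return True if the systemd runtime directory exists.

    This mirrors the documented sd_booted(3) check. The run_dir
    parameter exists only so the function can be exercised against
    other paths; it is an illustrative addition.
    """
    return Path(run_dir).is_dir()


if __name__ == "__main__":
    print("systemd is init" if booted_with_systemd() else "systemd is NOT init")
```

If this reports that systemd is not the init even though the distro enables it, the problem is more likely in how the instance was started than in the distro image.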
Thanks for pinging me, taking a look now
I'm doing some experiments with Windows support. I managed to replicate this CI attempt in my rebuild workflow, and it ran successfully on a default GitHub-hosted runner. Logs are available at https://github.com/arixmkii/qcw/actions/runs/13090314041/job/36526224725
The biggest difference in the setup is that I have to use the latest preview WSL build from https://github.com/microsoft/WSL/releases
@jandubois I debugged this. There actually is a related change in the commits you linked: the Git version bump. Lima uses the OpenSSH from the Git distribution.
The script "user session is ready for ssh" hangs indefinitely on Git 2.47 and newer releases. The same is the case for the latest msys2 OpenSSH. I downgraded Git on my system and managed to run the WSL2 machine.
I also managed to run almost all integration tests with this WSL2 machine when using OpenSSH inside an Alpine companion distro in WSL2 (not using any of the Windows tools): https://github.com/arixmkii/qcw/actions/runs/13474629971/job/37652601743
Conclusion: the machine didn't break. The Windows tooling has some sort of issue/regression, which may or may not be fixed.
I tried to create an isolated reproducer using the same script, running cat script sh | <openssh command from lima> in parallel with the hanging one, but was not able to reproduce it outside of Lima.
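For anyone attempting a similar reproducer, the "pipe a script into the ssh command and see whether it hangs" step can be wrapped in a timeout so a hang is reported rather than blocking forever. This is only a sketch: the helper name is hypothetical, and the actual OpenSSH command line that Lima builds is not shown in this thread, so it has to be supplied by the caller.

```python
import subprocess


def run_with_script_on_stdin(command, script, timeout_seconds):
    """Feed `script` (bytes) to `command` on stdin.

    Returns the exit code, or None if the command did not finish
    within `timeout_seconds` (treated here as a hang).
    `command` is an argv list, e.g. the ssh invocation copied from
    Lima's debug output; it is not constructed here.
    """
    try:
        completed = subprocess.run(
            command,
            input=script,
            capture_output=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising.
        return None
    return completed.returncode
```

A return value of None only says the command exceeded the timeout; it does not distinguish a real hang from a slow guest, so the timeout should be generous.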
FWIW, https://github.com/git-for-windows/git/issues/5199 may be relevant. Note that git-for-Windows picked up the fixes, but as far as I know it's not in upstream cygwin/msys2 yet.
https://github.com/actions/runner-images/commit/fcc4cdb1d095af1317859c4809364538953b3497 linked above shows that Git for Windows was updated to 2.47.0.windows.1, which would be an affected version. But right now it shows 2.47.1.windows.2, which should be fixed (so some runs may be succeeding).
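To make it easier to tell which case a given runner is in, the output of git --version can be compared against the versions mentioned in this thread. A rough sketch, assuming only what is reported here: 2.47.0.windows.1 is affected, 2.47.1.windows.2 is reported fixed, and versions before 2.47 worked after a downgrade. Anything else is deliberately classified as unknown, and the function names are illustrative.

```python
import re

# Versions named in this thread, as (major, minor, patch, windows build).
KNOWN_AFFECTED = (2, 47, 0, 1)  # 2.47.0.windows.1
KNOWN_FIXED = (2, 47, 1, 2)     # 2.47.1.windows.2

_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)\.windows\.(\d+)")


def parse_git_for_windows_version(text):
    """Extract (major, minor, patch, build) from `git --version` output."""
    match = _VERSION_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def classify(text):
    """Classify a version string using only the reports in this thread."""
    version = parse_git_for_windows_version(text)
    if version is None:
        return "unknown"
    if version == KNOWN_AFFECTED:
        return "affected"
    if version == KNOWN_FIXED:
        return "reported fixed"
    if version[:2] < (2, 47):
        return "pre-2.47"
    return "unknown"
```

For example, classify("git version 2.47.0.windows.1") returns "affected", while a version between the two known ones returns "unknown" because nothing in this thread says where exactly the fix landed.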
@mook-as Thank you! I tested with the updated runner (I'm using Windows Server 2025, but that should not behave any differently here) with git version 2.47.1.windows.2, and it passed the tests: https://github.com/arixmkii/qcw/actions/runs/13552437877/job/37879648090