[WIP] Run CI on QEMU
OK, I'm confused :\
hmm, that last run provides a clue:
[ 21.213595] serial8250: too much work for irq4
Linux serial console bug?
Maybe this will help?
There's a 6h time limit on GitHub jobs. If it goes past that, it cancels the job. You guys probably already noticed that, but I thought it wouldn't hurt to say it.
Indeed, that is something that we will deal with after this first issue is fixed.
@fosslinux what is the current issue?
This is a good example.
https://github.com/fosslinux/live-bootstrap/actions/runs/18679236148/job/53256256615
Note how just midway through the first musl build after the new Linux kernel is run, QEMU suddenly dies?
It's very unclear why this is happening. Note that the serial8250: too much work for irq4 messages only appear sometimes.
Sometimes QEMU won't die but just hang instead.
@fosslinux do we need -serial for something or was it just an attempt at working out the problem or getting info on it?
I think these couple of runs show promise:
https://github.com/fosslinux/live-bootstrap/actions/runs/18630275708/job/53113976191 ("another test") https://github.com/fosslinux/live-bootstrap/actions/runs/18623287945/job/53097415539 ("jasdklfjasdklfj")
They were both cancelled by the time limit, which can be seen on their summary page:
GitHub recommends this approach instead of sudo. I don't think it's related to the instabilities, but it might be a better way to enable kvm than running as root.
I was attempting to do the exact same thing in my personal fork months ago. Here are some of my runs:
https://github.com/alganet/live-bootstrap/actions (look for the ones with qemu in the title).
At the time, I was consistently getting it to run up until the 6h limit.
@fosslinux do we need -serial for something or was it just an attempt at working out the problem or getting info on it?
Not needed, that was just a test.
I think these couple of runs show promise:
https://github.com/fosslinux/live-bootstrap/actions/runs/18630275708/job/53113976191 ("another test") https://github.com/fosslinux/live-bootstrap/actions/runs/18623287945/job/53097415539 ("jasdklfjasdklfj")
Unfortunately, they don't either. That is the other case, where the build hangs. Look at the logs: the build process dies at a similar point. See the two-hour gap between the last command output and the job being terminated? Same problem.
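For anyone who wants to spot that kind of silence without scrolling, here is a rough Python sketch that finds the longest gap between consecutive timestamped lines of a saved log. It assumes each line starts with an ISO-8601 UTC timestamp, which is how I understand the raw Actions logs to look; `largest_gap` and the sample lines are made up for illustration and are not taken from the runs above.

```python
import re
from datetime import datetime

# Assumed line shape: "<ISO-8601 UTC timestamp> <message>", for example
# "2025-01-01T00:00:05.1234567Z some output". Sub-second digits are ignored.
_TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")


def largest_gap(log_lines):
    """Return (gap_seconds, line_before_gap, line_after_gap) for the longest
    silence between consecutive timestamped lines, or None if fewer than
    two lines carry a timestamp."""
    best = None
    prev_time = prev_line = None
    for line in log_lines:
        match = _TS.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        now = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        if prev_time is not None:
            gap = (now - prev_time).total_seconds()
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = now, line
    return best


if __name__ == "__main__":
    # Made-up sample lines, only to show the call.
    sample = [
        "2025-01-01T00:00:00.0000000Z start",
        "2025-01-01T00:00:05.0000000Z building",
        "2025-01-01T02:00:05.0000000Z cancelled",
    ]
    print(largest_gap(sample))
```

A gap of hours right before the cancellation line points to a hang, while a log with no large gap anywhere means something was still producing output up to the limit.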
I was attempting to do the exact same thing in my personal fork months ago. Here are some of my runs:
https://github.com/alganet/live-bootstrap/actions (look for the ones with qemu in the title).
At the time, I was consistently getting it to run up until the 6h limit.
Thanks, that's helpful, I will inspect the differences.
@fosslinux
I see, you're right. For the first run, the cancelling happens after it hangs for a while without any output.
However, for the second run "jasdklfjasdklfj", the timestamps on the GitHub raw logs indicate that qemu was producing output just before the process reached the time limit:
Unfortunately, the logs for my runs are not available anymore. One similarity between my runs and the "jasdklfjasdklfj" run here is that both invoked qemu outside Python's subprocess.run. I don't know if that is related (it seems unlikely), but I think it is worth some job retries to see if it's consistent.
I'll probably do that test in my fork tomorrow, using the exact code from "jasdklfjasdklfj". If I find it to be consistent, I'll investigate some subprocess.run optional parameters or maybe an alternative to it. I will also report back if I discover that any of the runs hang.
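One alternative to subprocess.run that could be tried is sketched below: it starts the child with subprocess.Popen, echoes each output line with an elapsed-time stamp, and kills the child if no line arrives for a set number of seconds, so a hang fails fast instead of idling until the job limit. This is only a sketch under assumptions; `run_with_stall_watchdog` and `stall_timeout` are hypothetical names, not anything from the repository, and the command passed in would be whatever qemu command line the CI already uses.

```python
import queue
import subprocess
import sys
import threading
import time


def run_with_stall_watchdog(cmd, stall_timeout):
    """Run cmd, echo its output with elapsed-time stamps, and kill it if
    no output line arrives for stall_timeout seconds.

    Returns (returncode, lines, stalled).
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so nothing is lost
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    lines_queue = queue.Queue()

    def pump():
        # Reader thread: forward each line, then a None sentinel at EOF.
        for line in proc.stdout:
            lines_queue.put(line)
        lines_queue.put(None)

    threading.Thread(target=pump, daemon=True).start()

    start = time.monotonic()
    lines = []
    stalled = False
    while True:
        try:
            line = lines_queue.get(timeout=stall_timeout)
        except queue.Empty:
            # Nothing was printed for stall_timeout seconds: treat as a hang.
            stalled = True
            proc.kill()
            break
        if line is None:
            break  # child closed its output
        lines.append(line.rstrip("\n"))
        print(f"[{time.monotonic() - start:8.1f}s] {lines[-1]}", flush=True)
    proc.wait()
    return proc.returncode, lines, stalled


if __name__ == "__main__":
    # Stand-in child process; a real run would pass the qemu command instead.
    print(run_with_stall_watchdog([sys.executable, "-c", "print('hello')"], 10))
```

The elapsed-time stamps also make it easier to tell a real hang from a build that is just slow, since the last stamped line shows when output stopped.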
New clue: "cat: write error: Resource temporarily unavailable". And then the output of "cat" is seemingly truncated.
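For context, "Resource temporarily unavailable" is the Linux error text for EAGAIN, which a write returns when the file descriptor is in non-blocking mode and the other end is not draining data fast enough. Whether cat's stdout really was non-blocking in these runs is an assumption, not something the logs confirm. The Python sketch below only reproduces the same error class on a plain pipe; `fill_nonblocking_pipe` is a made-up name.

```python
import errno
import os


def fill_nonblocking_pipe():
    """Write to a pipe nobody reads until the kernel refuses more data.

    Returns (bytes_written, errno_of_the_failure).
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # the assumed precondition
    written = 0
    chunk = b"x" * 4096
    try:
        while True:
            # Succeeds until the pipe buffer is full, then raises.
            written += os.write(write_fd, chunk)
    except BlockingIOError as exc:
        return written, exc.errno
    finally:
        os.close(read_fd)
        os.close(write_fd)


written, err = fill_nonblocking_pipe()
print(written, errno.errorcode[err], os.strerror(err))
```

A writer that does not retry on EAGAIN simply stops there, which would be consistent with truncated output, but that link is a guess until it is tested.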
Unfortunately, the run time of qemu suggests that it still likely failed at the exact same point as previous runs, so the serial console IRQ error is probably unrelated.
"jasdklfjasdklfj" has a bug: the build process runs twice. That is why we see output right up until the time limit; it comes from the second run of the build process.
I noticed some mounts are failing. This doesn't seem to happen on the bubblewrap version.
I am not familiar with that part. If it's not related and an issue for later, just ignore me :D