chore: Add debug logging of inactive (stuck) domains
We have seen that in a small number of cases the QEMU process locks up after VM shutdown and cannot be killed by libvirtd. This PR adds code that detects when this happens, logs the state of the QEMU process, and then reboots the machine.
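For illustration only, here is a minimal sketch of how the state of a process could be captured for logging on Linux. It is not the code in this PR: the names `parse_proc_stat`, `looks_stuck`, and `read_proc_state` are hypothetical, and treating the `D` (uninterruptible sleep) and `Z` (zombie) states as "stuck" is an assumption about what a lockup looks like.

```python
from typing import NamedTuple


class ProcState(NamedTuple):
    pid: int
    comm: str
    state: str


def parse_proc_stat(line: str) -> ProcState:
    """Parse the first three fields of a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return ProcState(pid, comm, state)


def looks_stuck(state: str) -> bool:
    # Assumption: 'D' (uninterruptible sleep) or 'Z' (zombie) is treated
    # as a sign that the process cannot be killed normally.
    return state in ("D", "Z")


def read_proc_state(pid: int) -> ProcState:
    # Linux-only: reads the kernel's view of the process.
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())
```

A caller could log the returned `ProcState` before deciding whether to reboot; the decision logic itself is left out here because the PR description does not specify it.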
Unfortunately, this is a fairly unusual state in libvirt, so we cannot easily simulate it in tests without mocking the libvirt API. However, I tested the code manually with a custom-built libvirtd in which I can simulate the lockup.
The PR also refactors error handling and moves some previously untested code onto the tested code path.