RVVM
Monotonic timer issues when pausing vCPUs
The dilemma
- Keeping the monotonic timer (`riscv_clocksource`) running when stopping the guest vCPUs results in guest software going mad and believing some kind of hang occurred, watchdogs going off, etc.
- Stopping the monotonic timer together with vCPUs fixes the random guest crashes, but results in eventual time dilation inside the guest (`date` reports wall clock from the past). This is caused by guests using the monotonic clock to offset the boot date timestamp, instead of querying the RTC reliably. Might break SSL at some point, especially if we consider VM snapshots.
- We don't always have control over the host monotonic timer in suspend. On Linux, `CLOCK_MONOTONIC` reports the time that the system was actually running, but on Windows, `QueryPerformanceCounter` keeps running in suspend, leading to pt. 1.
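To see the Linux behavior from pt. 3 on a host, here is a minimal sketch in Python (not RVVM code). It relies on `CLOCK_BOOTTIME` counting time in suspend while `CLOCK_MONOTONIC` does not, so their difference is roughly the time the host spent suspended since boot. The function name `host_suspended_seconds` is made up for illustration.

```python
import time


def host_suspended_seconds():
    """Return seconds the host spent suspended since boot, or None if unknown.

    Linux-specific: CLOCK_BOOTTIME includes time in suspend,
    CLOCK_MONOTONIC does not.
    """
    if not (hasattr(time, "CLOCK_BOOTTIME") and hasattr(time, "CLOCK_MONOTONIC")):
        return None  # not Linux, or an older Python without these constants
    try:
        running = time.clock_gettime(time.CLOCK_MONOTONIC)
        total = time.clock_gettime(time.CLOCK_BOOTTIME)
    except OSError:
        return None  # clock not supported by this kernel
    return total - running
```

Calling it before and after a host suspend should show the value grow by about the suspend duration, which is the same gap the guest sees in `date`.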
Steps to reproduce
- Run RVVM on a Linux host
- Suspend the host system
- Wait around 20 minutes
- Wake the host system
- Run `date` in guest, observe wall clock time being 20 minutes into the past
- Run RVVM on a Windows host, preferably with an Arch RISC-V guest or anything else running systemd
- Suspend the host system
- Wait around 20 minutes
- Wake the host system
- Boom: Hung task stacktraces in dmesg, random systemd services going down, etc. Possible reboot of the guest system.
Workarounds
- Do not pause VMs and don't suspend your host, nothing else to suggest as of now
Suggested fix (Needs guest assistance)
- Uniformly pause VM clocksource when not running vCPUs. Provide some kind of RTC re-read signal to the guest, or run an NTP service to prevent wall clock time dilation.
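A rough sketch of what "pause the clocksource together with the vCPUs" could look like, in Python for brevity. This is not how RVVM implements it; `PausableClock` and its methods are hypothetical names. The idea is to keep an offset and grow it by the length of every pause, so the guest-visible timer never sees the gap.

```python
import time


class PausableClock:
    """Monotonic clock that stops advancing while paused (illustrative only)."""

    def __init__(self, now=time.monotonic):
        self._now = now           # host time source, injectable for testing
        self._offset = now()      # host time that corresponds to guest time 0
        self._paused_at = None    # host time of the pause, or None if running

    def pause(self):
        if self._paused_at is None:
            self._paused_at = self._now()

    def resume(self):
        if self._paused_at is not None:
            # Hide the paused interval from the guest by shifting the offset
            self._offset += self._now() - self._paused_at
            self._paused_at = None

    def read(self):
        # While paused, keep returning the value from the moment of the pause
        host = self._paused_at if self._paused_at is not None else self._now()
        return host - self._offset
```

With this scheme the guest never observes a jump, which avoids the watchdog problem, but its wall clock falls behind by the total paused time unless something re-syncs it.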
Guest systems need some kind of resume-after-sleep handling to function properly. For example, reset the RTC clock to the value from an NTP server on resume.
The RTC always reports the correct wall clock time & date. However, Linux guests (and I imagine some other systems) read the RTC only once per boot, and then add monotonic time since boot to it to provide the current date instead of repeatedly reading the RTC.
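As a worked example of that arithmetic (the numbers are made up, and `guest_wall_clock` is just an illustrative helper): if the guest timer was paused for 20 minutes out of an hour of real time, the guest date ends up 1200 seconds behind.

```python
def guest_wall_clock(rtc_at_boot, monotonic_since_boot):
    """Wall clock as the guest derives it: one RTC read plus monotonic time."""
    return rtc_at_boot + monotonic_since_boot


rtc_at_boot = 1_700_000_000   # Unix timestamp read from the RTC at boot (example value)
real_elapsed = 60 * 60        # one hour of real time has passed since boot
paused = 20 * 60              # the guest timer was stopped for 20 of those minutes

guest_now = guest_wall_clock(rtc_at_boot, real_elapsed - paused)
real_now = rtc_at_boot + real_elapsed
lag = real_now - guest_now    # how far the guest date is behind, in seconds
```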
I have not yet found a way to notify the guest "hey go read the RTC", nor a device spec which even has such a feature.
Some other VMs like VirtualBox implement guest wall clock winding via guest additions (literally VM-specific drivers & software running in the guest). But there is no generic device with an existing driver to do that.
I think it is a good idea to introduce some "guest control device" that will handle VM-specific tasks. It can, for example, send an interrupt and a message when the VM is resumed. It can also be used for graceful shutdown handling by VM window close button.
If we're running additional software in guests then we may use existing software to sync from NTP or the RTC device. This is not exactly a new problem (affects QEMU, virt too) and people are using chrony or systemd-timesyncd to solve this.
See: https://serverfault.com/questions/334698/how-to-keep-time-on-resumed-kvm-guest-with-libvirt
In QEMU, on the emulation side, they internally just allow switching between the two variants of behavior that are already experienced with RVVM (delay and discard). They also have a catchup mode which speeds up the timer for a small period after resume, but it's a pretty broken solution (imagine animations, games, timeouts speeding up 10x for some time after resuming). By default they're using delay (pause the timer together with vCPUs).
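To make the difference between the three behaviors concrete, here is a toy model in Python. It is not QEMU code; the function, its parameters and the default 10x rate are assumptions based only on the description above. It computes what the guest timer reads some time after resume under each policy.

```python
def guest_timer_after_resume(policy, guest_at_pause, paused_for, since_resume,
                             catchup_rate=10.0):
    """Guest timer value `since_resume` seconds after the VM was resumed.

    guest_at_pause: guest timer value when the vCPUs were stopped
    paused_for:     how long the vCPUs were stopped, in host seconds
    catchup_rate:   timer speed-up factor during catch-up (must be > 1)
    """
    if policy == "delay":
        # Timer was paused with the vCPUs: the gap is never seen by the guest
        return guest_at_pause + since_resume
    if policy == "discard":
        # Timer kept running: the guest sees the whole gap as one jump
        return guest_at_pause + paused_for + since_resume
    if policy == "catchup":
        # Timer runs faster until the backlog is gone, then at normal speed
        backlog_cleared_after = paused_for / (catchup_rate - 1.0)
        fast = min(since_resume, backlog_cleared_after)
        return guest_at_pause + fast * catchup_rate + (since_resume - fast)
    raise ValueError(f"unknown policy: {policy}")
```

In this model, after a 20 minute pause the catchup timer keeps running 10x fast for a bit over two minutes of host time before it matches discard, which is the window where guest timeouts and animations would misbehave.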
> It can also be used for graceful shutdown handling by VM window close button.
Please implement shutdown handling in Haiku on HID_KEY_POWER (0x66) key. That's all that's needed really (otherwise we still need to implement something on the guest side, but that's not gonna be a standard thing either).
I'll note that this behavior in general is not implemented in QEMU and such, and the guests are expected to have some power resilience implemented in their FS. I've never had a Linux VM with ext4 corrupt due to abrupt RVVM window closing.
If chrony or some other solution doesn't solve the issue then we may write our own service which directly reads the RTC, say, every 10-30 seconds and corrects the system date. It will solve this issue for RVVM and other VM solutions, and may later be packaged for many distros. This service may then be ported over to Haiku too.
The biggest downside is the effort needed to write such a service correctly and maintain it.
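A very rough sketch of such a service, in Python, just to show the shape of it. Everything here is an assumption rather than an existing tool: it assumes a Linux guest that exposes the RTC at `/sys/class/rtc/rtc0/since_epoch`, an RTC kept in UTC, and enough privileges to set `CLOCK_REALTIME`. The function names and the 2 second threshold are made up.

```python
import time

# Assumption: Linux guest with the first RTC exposed in sysfs
RTC_PATH = "/sys/class/rtc/rtc0/since_epoch"


def read_rtc_seconds(path=RTC_PATH):
    """Read the RTC as seconds since the Unix epoch."""
    with open(path) as f:
        return int(f.read().strip())


def needs_correction(rtc_seconds, system_seconds, threshold=2.0):
    """True if the system clock differs from the RTC by more than threshold."""
    return abs(rtc_seconds - system_seconds) > threshold


def set_system_clock(seconds):
    """Step the system wall clock (needs root / CAP_SYS_TIME)."""
    time.clock_settime(time.CLOCK_REALTIME, seconds)


def sync_once(read_rtc=read_rtc_seconds, now=time.time,
              set_clock=set_system_clock, threshold=2.0):
    """Compare RTC and system time once; step the clock if they disagree."""
    rtc = read_rtc()
    if not needs_correction(rtc, now(), threshold):
        return False
    set_clock(float(rtc))
    return True


def run(interval=15.0):
    """Poll the RTC forever, every `interval` seconds."""
    while True:
        sync_once()
        time.sleep(interval)
```

Calling `run()` as root would start the polling loop. Note that this steps the clock abruptly, while daemons like chrony can slew it gradually, which is part of why writing this correctly takes effort.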
> Please implement shutdown handling in Haiku on HID_KEY_POWER (0x66) key
Is it already implemented on RVVM side? For me it is fine that window close button will act as shutdown button and if it is required to force VM termination (guest frozen etc.), VM process can be killed by some external method (process manager etc.).
> guests are expected to have some power resilience implemented in their FS.
The problem is not FS corruption, but the loss of unsaved data and state.
> Is it already implemented on RVVM side? For me it is fine that window close button will act as shutdown button and if it is required to force VM termination (guest frozen etc.), VM process can be killed by some external method (process manager etc.).
I want the second exit button click to kill the VM. However, maybe some kind of timeout may be established there. For now I'll submit a patch for the Haiku window implementation for you to test. Later, some window system rewrite is expected to merge some of the functionality.
Since 0666655, RVVM should behave on Windows hosts in suspend the same way it does on Linux (Monotonic clocksource skips time in suspend). Some other systems may need a similar fix.
MacOS mach_absolute_time() should not count in suspend, but CLOCK_MONOTONIC does, so mach_absolute_time() should be prioritized. OpenBSD provides CLOCK_UPTIME for this.
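For illustration, here is how picking a clock that does not count suspend could look per platform, sketched in Python rather than RVVM's own code. On macOS it uses `CLOCK_UPTIME_RAW`, which the macOS `clock_gettime` man page documents as returning the same value as `mach_absolute_time()` after timebase conversion. The function names are hypothetical, and the fallback makes no promise about suspend behavior.

```python
import sys
import time


def suspend_excluding_clock():
    """Return (name, clock_id) of a clock that stops in suspend, or (None, None)."""
    if sys.platform == "darwin" and hasattr(time, "CLOCK_UPTIME_RAW"):
        # Same timeline as mach_absolute_time(); does not advance while asleep
        return "CLOCK_UPTIME_RAW", time.CLOCK_UPTIME_RAW
    if sys.platform.startswith("openbsd") and hasattr(time, "CLOCK_UPTIME"):
        # Time the system has been running, excluding suspend
        return "CLOCK_UPTIME", time.CLOCK_UPTIME
    if sys.platform.startswith("linux") and hasattr(time, "CLOCK_MONOTONIC"):
        # On Linux, CLOCK_MONOTONIC already excludes suspend
        return "CLOCK_MONOTONIC", time.CLOCK_MONOTONIC
    return None, None


def running_time_seconds():
    """Read the chosen clock; fall back to time.monotonic() elsewhere."""
    _name, clock_id = suspend_excluding_clock()
    if clock_id is None:
        # Fallback: whether this counts suspend depends on the platform
        return time.monotonic()
    return time.clock_gettime(clock_id)
```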
Fixed clocksource jumps after suspend on MacOS and OpenBSD too.
Now all that's left is to pause clocksource when RVVM machine is explicitly paused, etc.
https://lwn.net/Articles/429925/ might help a bit
Many guests at this point have proper support for NTP out of the box (Arch RISC-V, Debian, Fedora). After prolonged periods of suspend on the host and later wakeup, guests sync their wall clocks to actual real-world clock (Currently obtained from the internet).
As the primary issue with guest wall clock drift was eventual inability to access SSL-protected internet resources (web/distro repos), I consider this issue to be now resolved.
Other guest operating systems might need special steps to enable NTP clock sync, but that's beyond RVVM's responsibility.