kvm-guest-drivers-windows
[virtio-fs] Suspected memory leak
Describe the bug: This is a user report received on the Unraid Forums.
To Reproduce: Run a VM with virtiofs mappings.
Expected behavior: No memory leakage.
Screenshots
Host:
- Distro: Unraid/openSUSE
- Kernel version: 6.1.63/Unknown
- QEMU version 7.2/8.1.2
- libvirt version 8.7/Unknown
VM:
- Windows version 10/11
- Which driver has a problem: virtiofs or winfsp
- Driver version or commit hash that was used to build the driver: see images
Additional context: Mmdi is found in pooltag.txt, so you actually have to use xperf and WPA for further debugging. Following the method described there, I captured a snapshot of the memory growth, opened it in WPA, loaded symbols, and expanded the Mmdi pool to find stack references to winfsp-x64.dll and virtiofs.exe. So there's the smoking gun: one of these drivers is the culprit.
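For anyone who wants to repeat that kind of capture, a rough sketch of the xperf side is below. This is a generic example based on the Windows Performance Toolkit documentation, not the exact command line behind the screenshots here; the buffer sizes and output file name are illustrative.
rem Start kernel tracing with call stacks for pool allocations and frees (run from an elevated prompt):
xperf -on PROC_THREAD+LOADER+POOL -stackwalk PoolAlloc+PoolFree -BufferSize 2048 -MaxFile 1024 -FileMode Circular
rem Reproduce the virtiofs I/O load while the trace runs, then stop and merge it:
xperf -d mmdi-capture.etl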
I upgraded to the latest versions of WinFSP (2.0) and Virtio-win guest tools (0.1.240) and the leak is still active.
I've experienced this exact non-paged pool leak under Windows 11 using the latest released Virtio and WinFSP drivers. I've also tested with the latest released Rust virtiofsd. When transferring files the non-paged pool grows and the memory is never released. Looking with poolmon, it's always the Mmdi tag ballooning in memory usage.
I have been running a Windows 10 guest under KVM for about 1 year now, and I have experienced this memory leak for the entire time. I am passing through a GPU and a PCIe USB card, and I also mount a few host folders on the guest using VirtioFS. When there is heavy disk I/O on these VioFS folders, the RAM usage of the guest starts increasing rapidly until it eventually reaches 100%, runs out of swap, and then crashes. The rate at which the RAM depletes varies with the amount of disk I/O on the VioFS folders. In the worst case (when the backup program is running and scanning all files), the RAM usage increases about 1 MB per second and the crash occurs in about 4 hours (16 GB of RAM allocated to the guest). In order for the backup to complete, I have to reboot the guest multiple times to avoid the system crashing.
I found multiple sources that mentioned KVM memory ballooning causes memory leaks when used in combination with GPU passthrough. I disabled ballooning, but this did nothing.
I also tried disabling the virtio serial device. This also did nothing.
I followed Microsoft's guide to track down kernel memory leaks using poolmon. The memory is going to a Non-paged pool with the tag "Mmdi" and the description "MDLs for physical memory allocation".
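For reference, the poolmon side of that guide boils down to something like the following. This is a generic sketch: poolmon ships with the WDK, and the pooltag.txt path below is the usual Debugging Tools location, so it may differ on other systems.
rem Watch live pool usage sorted by bytes allocated:
poolmon -b
rem Or limit the display to just the Mmdi tag:
poolmon -iMmdi
rem Look the tag up in pooltag.txt to see which component owns it:
findstr /i Mmdi "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\triage\pooltag.txt"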
I provided the debug info in the screenshot above, tracing the Mmdi pool growth to either virtiofs.exe or winfsp-x64.dll. I can assist with any further debug information required.
@SimonFair Thank you for your report. Maybe I am missing something, but where is the growth? BTW: to be a leak, the growth needs to be consistent (not jumps, which might be related to temporary allocations).
Can you show "before" and "after" allocation counts? Also what's the amount of memory actually allocated?
In any case, it's worth investigating.
@YanVugenfirer The growth is consistent while a transfer is occurring. It's in the non-paged pool. It can be observed with poolmon by looking at the Mmdi tag, as @christophocles explained. When using backup software it grows extremely quickly and will keep growing until it runs out of memory. Stopping the transfer will not free any memory; only rebooting the VM will.
@YanVugenfirer here's a screenshot of poolmon showing the kernel memory pool usage after a few hours. The nonpaged pool tag Mmdi grows continuously, unbounded, until the system crashes. The growth is accelerated when there is a lot of disk read/write activity on the virtiofs shares. The list is sorted by bytes allocated, and Mmdi is the highest with 4.2 GB.
And here is poolmon immediately after rebooting the guest. Mmdi is only 4.6 MB.
Here is another capture of the Mmdi growth using xperf and wpa. This capture is 6 minutes, with 225MB of memory allocations.
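For completeness, opening such a capture for analysis looks roughly like this. Again a sketch, not the exact steps used for the screenshots; the trace file name matches the illustrative example above and the symbol cache path is arbitrary.
rem Point the analysis tools at the Microsoft public symbol server:
set _NT_SYMBOL_PATH=srv*C:\symbols*https://msdl.microsoft.com/download/symbols
rem Open the merged trace in Windows Performance Analyzer, load symbols,
rem and drill into the pool graphs filtered on the Mmdi tag:
wpa -i mmdi-capture.etl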
I am not sure if this bug report should go to this project or to WinFSP. Both seem to be involved with the Mmdi allocations.
@christophocles Thanks a lot! We will take a look and investigate.
I tried to reproduce this issue with the latest Rust virtiofsd and virtio driver (242) on a Win11 guest, but could not reproduce it.
- Mounted one virtiofs shared dir.
- Ran fio in the shared dir:
"C:\Program Files (x86)\fio\fio\fio.exe" --name=stress --filename=Z:/test_file --ioengine=windowsaio --rw=write --direct=1 --size=1G --iodepth=256 --numjobs=128 --runtime=180000 --thread --bs=64k
- Monitored with poolmon.exe, but there was no memory leak.
@SimonFair Could you share what IO operations you run in your environment? Thanks in advance.
@SimonFair are you using Rust virtiofsd?
I've tested this on Rust virtiofsd under Unraid and had the Mmdi leak. Perhaps @christophocles has more insight. I believe he used a different distro, per our conversation on the Unraid forums, and he may be able to share what occurred on that platform.
@xiagao The latest drivers we were able to get were .240. How do we test with .242? Can you provide a binary for us to test with?
@YanVugenfirer The bug report originated from my system, and others on the Unraid forums have reported the same issue. Yes, I am using Rust virtiofsd 1.7.2 which is the version currently packaged on openSUSE Tumbleweed.
@xiagao I am also using virtio-win driver version 0.240 since that is the latest binary release. I have Visual Studio and the driver toolkits installed, so my environment is set up to compile newer drivers from source if needed for testing. Tonight I will spin up a fresh Win10 VM and try to reproduce the leak again myself, with the minimum required steps. It's possible that other features of my specific system are interacting to trigger the memory leak (e.g. PCIe passthrough?). If I am able to successfully reproduce the leak on a new VM, I will post detailed steps to reproduce.
@christophocles @SimonFair
The latest Rust virtiofsd is 1.8.0. Please try it.
@kostyanf14 I ran virtiofsd 1.8.0 on Unraid and ran into the same memory leak.
2. "C:\Program Files (x86)\fio\fio\fio.exe" --name=stress --filename=Z:/test_file --ioengine=windowsaio --rw=write --direct=1 --size=1G --iodepth=256 --numjobs=128 --runtime=180000 --thread --bs=64k
Where is fio.exe? I only have C:\Program Files\Virtio-Win\VioFS\virtiofs.exe
Hi, you can find the fio binary via https://fio.readthedocs.io/en/latest/fio_doc.html .
@kostyanf14 I ran virtiofsd 1.8.0 on Unraid and ran into the same memory leak. Could you share what IO test you did on the shared folder? I will also try some other tools, such as iozone and IOMeter.
What always does it for me is a free trial of Backblaze Personal Backup and letting it back up my large media library stored on a VirtioFS mount. That will cause Mmdi to grow very quickly.
I should also add that I use this batch script to mount several Unraid shares as different drive letters:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsJ Tag1 J:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsl Tag2 l:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsM Tag3 m:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsS Tag4 s:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsT Tag5 T:
I previously ran: "C:\Program Files (x86)\WinFsp\bin\fsreg.bat" virtiofs "C:\Program Files\Virtio-Win\VioFS\virtiofs.exe" "-t %%1 -m %%2"
Has anyone been able to repro this?
I reproduced this issue with multiple source directories mapped from the host to a Win11 guest, using IOMeter to create a lot of disk read/write activity on the virtiofs shares.
Here are some screenshots showing that the nonpaged pool tag Mmdi grows continuously after starting the IO test, and the memory isn't released after the test is stopped.
@xiagao I'm glad it's not just us! Thank you for your effort. So hopefully this can eventually be fixed!
No problem.
Thanks for reporting this issue.
Thank you! Can't wait to finally use Virtiofs.
Is there a fix for the issue or has the root cause been found?
@SimonFair Not yet. Due to the holiday season we have not yet gotten to debugging it.
Just came to report my issue with the memory leaks too.
Running a Win11 VM for security cameras writing about 50 Mbps constantly over virtiofs will chew up my 16 GB of allocated RAM in about 24 hours.
Hope your team had a good holiday period and will look back into this in coming weeks / months for any updates.
It would be great to know if any progress has been made on this bug.
@mackid1993 No progress due to the holiday season
@kostyanf14 and I found an issue that caused the memory leak (hopefully the only one). CI will soon build a driver that can be tested, if anyone is interested.
@YanVugenfirer Can you please provide a link to the driver once it's been built? Thank you!