
AppVM with GPU pass-through crashes when more than 3.5 GB (3584MB) of RAM is assigned to it

Open pqyptixa opened this issue 5 years ago • 33 comments

Qubes OS version:

R4.0

Affected component(s):

Debian 10 AppVM


Steps to reproduce the behavior:

So, I finally managed to pass a GPU to a VM with PCI passthrough a few weeks ago, but for some reason I can't get it to run with more RAM. I asked #qubes_os @ freenode about this issue and ended up writing a guide on how to get GPU passthrough done. I'm not going to post the complete guide here, so here's a link to it: https://paste.debian.net/1043341/ . I don't mind if someone else posts it here or elsewhere, but hopefully people will discuss it and improve it.

My system: Qubes 4.0 running on an AMD Ryzen 1600 CPU + MSI B350 Mortar motherboard (BIOS 1E0) + ASUS RX560 2GB GPU for the VMs + an old NVIDIA 610 GPU for dom0 (as the secondary GPU).

Expected behavior:

AppVM runs just fine with > 3.5GB of RAM.

Actual behavior:

AppVM crashes with more than 3.5GB of RAM. The way it crashes depends on how much memory is assigned to it, but I usually see almost-instant kernel panics, systemd-init crashing, fsck failures and/or other random processes crashing, too.

General notes:

Other than removing "nomodeset" from the kernel command line, everything is left at its defaults.

PS: the link to the guide expired. Here are a couple of similar guides: https://github.com/Qubes-Community/Contents/blob/master/docs/customization/windows-gaming-hvm.md and https://neowutran.ovh/qubeos.html

pqyptixa avatar Sep 19 '18 18:09 pqyptixa

@pqyptixa how did you get the PCI passthrough to work? Would you be willing to explain your steps?

Jeeppler avatar Sep 19 '18 18:09 Jeeppler

@Jeeppler I left a link to the guide, but PCI passthrough "just worked" for me. I didn't need to mess with IOMMU groups or anything (in fact, "/sys/kernel/iommu_groups/" is empty in my system), simply passing the device by running "qvm-pci attach ..." worked.
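For reference, the full attach command looked something like this (the qube name and device address here are just examples, and "permissive" is often needed for GPUs; check qvm-pci attach --help for the exact option spelling on your release):

# Attach the GPU at dom0 BDF 0a:00.0 persistently to the qube
qvm-pci attach --persistent --option permissive=true my_appvm dom0:0a_00.0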

pqyptixa avatar Sep 19 '18 19:09 pqyptixa

@pqyptixa thanks for the explanation.

Jeeppler avatar Sep 19 '18 21:09 Jeeppler

This is caused by the default TOLUD (Top of Low Usable DRAM) of 3.75G provided by qemu not being large enough to accommodate the larger BARs that a graphics card typically has.
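(For reference, you can see how much MMIO space a card's BARs claim from dom0 with lspci; the device address below is just an example:)

# Each "Memory at ... [size=...]" line is one BAR of the device
lspci -vv -s 0a:00.0 | grep 'Memory at'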

The code to pass a custom max-ram-below-4g value to the qemu command line does exist in the libxl_dm.c file of xen, but there is no functionality in libvirt to add this parameter.

It is possible to manually add this parameter to the qemu commandline by doing the following in a dom0 terminal:

# Unpack the stubdom rootfs (a gzipped cpio archive) into a work directory
mkdir stubroot
cp /usr/lib/xen/boot/stubdom-linux-rootfs stubroot/stubdom-linux-rootfs.gz
cd stubroot
gunzip stubdom-linux-rootfs.gz
# Extract the cpio archive, then delete it so it isn't repacked later
cpio -i -d -H newc --no-absolute-filenames < stubdom-linux-rootfs
rm stubdom-linux-rootfs
# Edit the init script inside the rootfs
nano init

Before the line "# $dm_args and $kernel are separated with \x1b to allow for spaces in arguments." add:

SP=$'\x1b'
# Append max-ram-below-4g=3.5G to the "-machine xenfv" argument passed to qemu
dm_args=$(echo "$dm_args" | sed "s/-machine\\${SP}xenfv/-machine\\${SP}xenfv,max-ram-below-4g=3.5G/g")

Then execute:

# Repack the rootfs and install it over the original
find . -print0 | cpio --null -ov --format=newc | gzip -9 > ../stubdom-linux-rootfs
sudo mv ../stubdom-linux-rootfs /usr/lib/xen/boot/

Note that this will apply the change to all HVMs, so if you have any other HVM with more than 3.5 GB of RAM assigned, it will not start without the adapter being passed through.

Ideally, to fix this, libvirt should be extended to pass the max-ram-below-4g parameter through to Xen, and a calculation added to determine the correct TOLUD based on the total BAR size of the PCI devices being passed through to the VM.
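As a very rough illustration of what that calculation could look like (only a sketch against the standard sysfs layout, with a placeholder device address; no such code exists anywhere yet):

#!/bin/sh
# Sum the memory BAR sizes of one PCI device from sysfs.
# Each line of the resource file is "start end flags" in hex;
# IORESOURCE_MEM is flag bit 0x200.
dev=0000:0a:00.0   # placeholder: the device being passed through
total=0
while read -r start end flags; do
    if [ $(( flags & 0x200 )) -ne 0 ]; then
        total=$(( total + end - start + 1 ))
    fi
done < "/sys/bus/pci/devices/$dev/resource"
echo "total BAR size: $(( total / 1024 / 1024 )) MiB"

A TOLUD of 4G minus that total (rounded up) would then leave enough room for the BARs below 4G.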

alcreator avatar Sep 20 '18 01:09 alcreator

@alcreator Awesome, thanks for replying! I suspected QEMU was related to this, but I had no idea where to look for info, so I'm really glad you replied! I was able to make it work by adding your dm_args line to the init file with a small modification, so that it doesn't affect other VMs:

# Only lower TOLUD for the one qube that has the GPU attached
vm_name=$(xenstore-read "/local/domain/$domid/name")
SP=$'\x1b'
if [ x"$vm_name" == x"my_appvm" ]; then
    dm_args=$(echo "$dm_args" | sed "s/-machine\\${SP}xenfv/-machine\\${SP}xenfv,max-ram-below-4g=3.5G/g")
fi

Obviously this is ugly and hacky, but it works :D

Hopefully this will help other Qubes users! (Though here's a reminder: GPU passthrough is actually a security risk...) BTW, does /usr/lib/xen/boot/stubdom-linux-rootfs ever get updated?

pqyptixa avatar Sep 20 '18 03:09 pqyptixa

/usr/lib/xen/boot/stubdom-linux-rootfs will get updated every time the "xen-hvm-stubdom-linux" package is updated.
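If you want a quick way to tell from dom0 whether an update has overwritten the patch, something like this should work (the rootfs is just a gzipped cpio archive, so the string can be grepped directly):

if zcat /usr/lib/xen/boot/stubdom-linux-rootfs | grep -aq 'max-ram-below-4g'; then
    echo "patch still present"
else
    echo "patch lost - reapply it"
fi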

alcreator avatar Sep 20 '18 09:09 alcreator

Just in case someone had problems with the modified script I posted above: the correct command is vm_name=$(xenstore-read "/local/domain/$domid/name") instead of vm_name=$(xenstore-read "local/domain/$domid/name") as I had posted originally.

pqyptixa avatar Nov 04 '18 04:11 pqyptixa

Hopefully this will help other Qubes users! (Though here's a reminder: GPU passthrough is actually a security risk...)

What is the security risk if Dom0 is driven by a different GPU, and not the one that is assigned to passthrough? OP seems to have 2 GPU cards, one for Dom0 and one for passthrough. IIRC, the risk to passthrough of GPU was being able to see Dom0 screen from the VM.

ThePlexus avatar Dec 06 '18 10:12 ThePlexus

Hopefully this will help other Qubes users! (Though here's a reminder: GPU passthrough is actually a security risk...)

What is the security risk if Dom0 is driven by a different GPU, and not the one that is assigned to passthrough? OP seems to have 2 GPU cards, one for Dom0 and one for passthrough. IIRC, the risk to passthrough of GPU was being able to see Dom0 screen from the VM.

I'm not an expert, but AFAIK the problem is DMA. I'm not really sure about this, but AFAIU an attacker might be able to access system memory.

pqyptixa avatar Jan 10 '19 14:01 pqyptixa

I don't claim to be an expert either, but from my understanding, VT-d/IOMMU (assuming your firmware is trustworthy) should mean that a PCI passthrough of a 2nd (unused) GPU to a virtual machine gives it no access to the dom0 screen or RAM. From my limited understanding, the risk is in passing through the primary GPU. Maybe @rootkovska has more insight?

ThePlexus avatar Jan 10 '19 17:01 ThePlexus

On Thu, Jan 10, 2019 at 09:06:46AM -0800, shamen123 wrote:

I don't claim to be an expert either, but from my understanding VT-d/IOMMU, assuming your firmware is trustworthy, should mean a PCI passthrough of a 2nd (unused) GPU to a virtual machine should have no access to dom0 screen or RAM. From my limited understanding, the risk is passing through the primary GPU.

Yes, that's right.

One issue previously mentioned frequently about GPU passthrough was that it didn't work together with qemu in a stubdomain, so the procedure included running qemu directly in dom0, which is a security risk. But that isn't the case here in Qubes 4.0 - here it works with qemu still in a stubdomain.
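(If you want to confirm this on your own system: libxl names each stubdomain after its VM with a -dm suffix, so it shows up as a separate domain in dom0:)

# Run in dom0; each HVM's device model appears as "<vmname>-dm"
xl list | grep -- -dm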

marmarek avatar Jan 10 '19 20:01 marmarek

(quoting alcreator's TOLUD workaround from above)

You say in your workaround that stubdom-linux-rootfs should be in /usr/lib/xen/, but I only found it in /usr/lib64/xen. I changed the stubdom file as advised, but I still can't get the HVM started with more than 3 GB.

However, I installed a Qubes R4.1 version, and maybe that is the reason. I will install the new Qubes R4.0.3 now and see if I can get it to work there!
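For anyone else hunting for the file, something like this should locate it in dom0 regardless of the lib/lib64 (or libexec) split between releases:

# Search the usual install roots for the stubdom rootfs images
find /usr/lib /usr/lib64 /usr/libexec -name 'stubdom-linux*rootfs*' 2>/dev/null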

Lafachief avatar Mar 31 '20 17:03 Lafachief

@andrewdavidwong - I see this issue was added to the "far in the future" milestone back in 2018. Are we in the future yet? I am hitting the same issue and looking to see if there is a configurable way to increase the default TOLUD setting.

yeet648 avatar May 31 '20 03:05 yeet648

@andrewdavidwong - I see this issue was added to the "far in the future" milestone back in 2018. Are we in the future yet? I am hitting the same issue and looking to see if there is a configurable way to increase the default TOLUD setting.

That milestone was recently renamed "TBD," which better reflects the meaning of the milestone. Basically, the devs have not had a chance to decide whether and when, specifically, to tackle this. It's on the extremely long (and ever-growing) list of things we'd eventually like to do or see done. If you're up to it, patch contributions are always welcome, though we strongly encourage you to discuss your plan of work here before investing significant time.

andrewdavidwong avatar Jun 01 '20 21:06 andrewdavidwong

The link to the GPU guide is not working.

keldnorman avatar Jul 16 '20 12:07 keldnorman

It would be nice if the Qube Manager UI had a way to pass options to QEMU, sort of like the storage, network, etc. tabs in VirtualBox. Apart from the TOLUD stuff, I've had issues with emulated disk controllers, and it would be useful to be able to change networking options, too.


@keldnorman sadly, the link expired. Dumb me didn't think about that little detail when I posted the guide, and I couldn't find it through archive.org either, but here are a couple of other guides with more or less the same info:

https://neowutran.ovh/qubeos.html

https://github.com/Qubes-Community/Contents/blob/master/docs/customization/windows-gaming-hvm.md

pqyptixa avatar Jul 17 '20 23:07 pqyptixa

Just wanted to update that it also occurs when attaching USB devices instead of PCI devices. Perhaps there's some kind of sharing on my board, but when attaching USB devices to a VM with more than a certain amount of memory, I end up with similar issues as outlined in https://github.com/QubesOS/qubes-issues/issues/

zellchristensen avatar Oct 22 '20 20:10 zellchristensen

Any advice on how to patch Qubes 4.1?

blobless avatar Feb 07 '21 19:02 blobless

I am starting to switch from 4.0 to 4.1.

Patch. Before the line "# $dm_args and $kernel are separated with \n to allow for spaces in arguments", add:

# Patch the 3.5 GB limit (only for qubes whose name starts with "gpu_")
vm_name=$(xenstore-read "/local/domain/$domid/name")
if [ $(echo "$vm_name" | grep -iEc '^gpu_' ) -eq 1 ]; then
    dm_args=$(echo "$dm_args" | sed -n '1h;2,$H;${g;s/\(-machine\nxenfv\)/\1,max-ram-below-4g=3.5G/g;p}')
fi

The qube name needs to start with "gpu_". From my first tests, it seems to work as expected on my gaming Windows HVM and gaming Linux HVM, without breaking other things.
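Since there is no direct rename for qubes (as far as I know), an existing HVM can be cloned to a matching name; the names below are just examples:

# Clone the existing HVM to a name matching the gpu_ prefix,
# then remove the original once the clone is confirmed working
qvm-clone gaming gpu_gaming
qvm-remove gaming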

neowutran avatar Jun 21 '21 19:06 neowutran

(quoting neowutran's 4.1 patch from above)

I tried this for Qubes 4.1 RC1 but it did not work. Any suggestions?

ghost avatar Nov 15 '21 16:11 ghost

Automated announcement from builder-github

The component vmm-xen-stubdom-linux (including package xen-hvm-stubdom-linux-1.2.4-1.fc32) has been pushed to the r4.1 testing repository for dom0. To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

qubesos-bot avatar Apr 22 '22 16:04 qubesos-bot

I have just updated to the latest xen-hvm-stubdom-linux from the current-testing repository and this issue seems unfixed.

AppVMs crash with more than 3.5 GB of RAM assigned to them when doing GPU passthrough, usually with a No bootable device error.

This update also seems to stop the original file-patching hack from working, as I now get a crash in my HVMs even with it applied.

Dexkant avatar Apr 28 '22 00:04 Dexkant

I have the same problem as @Dexkant, but I don't have crashes, and it's unrelated to the RAM size: I'm getting a No bootable device error regardless of the assigned VM RAM size. It seems that this commit doesn't fix the issue: https://github.com/QubesOS/qubes-vmm-xen-stubdom-linux/commit/be8896ba2fae2e377aa236df4407bc7dbdcec60b

If I downgrade xen-hvm-stubdom-linux and xen-hvm-stubdom-linux-full to the previous version 1.2.3-1.fc32 and apply this patch https://github.com/QubesOS/qubes-issues/issues/4321#issuecomment-865273085 then I can boot my Windows HVM and it works fine.

Also, I don't know if it worked before, but if I set either of these features with version 1.2.3-1.fc32:

qvm-features gpu_win10 audio-model ich9
qvm-features gpu_win10 stubdom-qrexec 1

then I get stuck at a black screen after the Windows HVM starts, and I'm not able to pass USB devices through because stubdom-qrexec is not enabled.

UPD: Fixed USB device passthrough with version 1.2.3-1.fc32 by patching qemu-stubdom-linux-full-rootfs as well.

tzwcfq avatar Apr 28 '22 16:04 tzwcfq

Some new info. I have a Windows 10 HVM with Qubes Windows Tools 4.1.67.1.

With xen-hvm-stubdom-linux/xen-hvm-stubdom-linux-full 1.2.4-1.fc32:

  • HVM with 3 GB or 4 GB RAM, no PCI devices attached - works fine.
  • HVM with 3 GB or 4 GB RAM, any PCI device attached (I've tried GPU and USB controller) - No bootable device error.

With xen-hvm-stubdom-linux/xen-hvm-stubdom-linux-full 1.2.3-1.fc32, without the patch applied:

  • HVM with 3 GB or 4 GB RAM, with or without a PCI USB controller attached - works fine.
  • HVM with 3 GB RAM, with PCI USB controller and GPU attached - works fine.
  • HVM with 4 GB RAM, with PCI USB controller attached - works fine.
  • HVM with 4 GB RAM, with PCI GPU attached - it starts to boot with a black screen, then fails with the error dom0 qubesd[3557]: vm.gpu_win10: Start failed: qrexec-daemon startup failed: 2022-05-07 12:27:56.268 qrexec-daemon[44408]: qrexec-daemon.c:135:sigchld_parent_handler: Connection to the VM failed and, in /var/log/libvirt/libxl/libxl-driver.log:

libxl: libxl_pci.c:1489:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:01:00.1
libxl: libxl_pci.c:1484:libxl__device_pci_reset: write to /sys/bus/pci/devices/0000:01:00.0/reset returned -1: Inappropriate ioctl for device

With xen-hvm-stubdom-linux/xen-hvm-stubdom-linux-full 1.2.3-1.fc32, with the patch applied:

  • HVM with 3 GB RAM, without PCI devices / with PCI USB controller / with PCI USB controller and GPU attached - works fine.
  • HVM with 4 GB RAM, without PCI devices / with PCI USB controller attached - No bootable device error.
  • HVM with 4 GB RAM, with just the PCI GPU / with PCI USB controller and GPU attached - works fine.

UPD: I also tried Windows 10 with QWT 4.1.68-1 (installed with qubes-dom0-update from current-testing) and Windows 10 with QWT uninstalled - the No bootable device error remains.

tzwcfq avatar May 07 '22 05:05 tzwcfq

GPU: AMD Radeon RX 6900 XT. Skill level: my knowledge in this area is very limited; I'm mostly reading these threads looking for solutions.

My Windows VM with the RX 6900 XT didn't boot after updating. I speculate this update contained the PR https://github.com/QubesOS/qubes-vmm-xen-stubdom-linux/pull/44 by @mati7337 and is the version @tzwcfq was testing.

I got it working again, but I've switched to the testing branch and will need someone else who has this same issue to check on the main branch. The combination that worked for me:

  • BIOS: Above 4G Decoding: Enable (I had this enabled already, setting it to Disabled had a different issue)
  • BIOS: Resizable bar: Disable (Had to change this from Enabled to Disabled after the update)
  • Reapplied patch to both files: https://github.com/QubesOS/qubes-issues/issues/4321#issuecomment-865273085

(More info on steps.) When I ran my script to patch both stubdom files, it showed the files with the patch already applied. This doesn't make sense, as the stubdom files should have been replaced by the update, so I decided to start fresh by updating to the current-testing branch. This gave me clean stubdom files that I was then able to patch: sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing pulled in xen-hvm-stubdom-linux 1.2.4-1.fc32 and xen-hvm-stubdom-linux-full at the same version. I also reapplied the stubdom patches after it did boot with 3.5 GB of RAM.

Speculation on root cause: given that the patched files were still needed and that I had to disable resizable BAR in the BIOS, my guess is that resizable BAR causes issues with the merged PR's approach.

I feel that this defect should remain open, as I still needed to patch the stubdom files. https://github.com/QubesOS/qubes-issues/issues/4321#issuecomment-865273085

securitycopper avatar Jul 28 '22 23:07 securitycopper

Could someone who has problems with https://github.com/QubesOS/qubes-vmm-xen-stubdom-linux/commit/be8896ba2fae2e377aa236df4407bc7dbdcec60b post entries for the passed through devices from the output of lspci -vv in dom0?

mati7337 avatar Jul 30 '22 10:07 mati7337

@mati7337, I've attached what you requested. I currently have resizable BAR disabled; if you need the output again with resizable BAR enabled, I will see your notification late tonight. lspci_vv_RX6900XT.txt

securitycopper avatar Jul 30 '22 12:07 securitycopper

@securitycopper Thanks. Yeah, I'll also need it in the non-working state. All the PCI devices you've posted use memory at addresses higher than 0xf0000000 (3.75 GiB), which is above qemu's default TOLUD, so in that case my patch shouldn't change anything and the stubdom init patch also shouldn't be needed. Did you check whether it works with your current configuration but without the patch to the init?

mati7337 avatar Jul 30 '22 14:07 mati7337

@mati7337 I've completed the additional testing you requested, with steps and logs detailed in the attached file.

To summarize:

  • I validated that your PR works and that the file xen-hvm-stubdom-linux no longer needs to be patched. https://github.com/QubesOS/qubes-vmm-xen-stubdom-linux/commit/be8896ba2fae2e377aa236df4407bc7dbdcec60b
  • The file xen-hvm-stubdom-linux-full still requires the patch.
  • Resizable BAR doesn't work when enabled; however, that feels out of scope for this issue. (lspci -vv with resizable BAR enabled is part of the attachment.)

@mati7337, since your PR fixed xen-hvm-stubdom-linux, would a similar change work on xen-hvm-stubdom-linux-full? If so, you might be able to fully resolve this issue, which has been open since 2018. I am able to validate on my end. Thank you for the effort you've put into resolving this issue, and I apologize for the misunderstanding where I thought your PR would also fix xen-hvm-stubdom-linux-full.

It's also possible, based on your comment, that I've never actually needed xen-hvm-stubdom-linux patched; but since I've always patched that one first, before patching xen-hvm-stubdom-linux-full, I may have formed the false assumption that it was required. I'm unsure how to properly downgrade to before your patch to test that theory.

Issue4321_TestingStepsAndLogs.txt

securitycopper avatar Jul 31 '22 02:07 securitycopper

@securitycopper That's interesting, as the PCI addresses are all above 0xf0000000, so my patch shouldn't affect anything. Could you also check the contents of /proc/iomem in the working state from within a virtual machine? That's the file that gets read by my patch. You can also try downgrading using qubes-dom0-update --action=downgrade xen-hvm-stubdom-linux.

For now I won't be patching xen-hvm-stubdom-linux-full, as this patch seems to have caused more problems than it solved. It works with my GPU (a 750 Ti), but the lowest PCI BAR of that card is at 0xe0000000, and the patch seems not to work with some other GPUs. The interesting thing is that even though there are cards with BARs at addresses as low as 0xd0000000, the temporary patch always sets TOLUD to 3.5 GiB. Maybe checking whether the lowest BAR is at an address lower than 0xf0000000 and, if it is, setting TOLUD to 3.5 GiB would be the best solution.
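A sketch of that heuristic, reusing the sysfs resource format from earlier in this thread (placeholder device address; untested):

#!/bin/sh
# If any memory BAR of the device sits below qemu's default TOLUD
# (0xf0000000), TOLUD would be lowered to 3.5G; otherwise left alone.
dev=0000:0a:00.0   # placeholder device address
needs_patch=0
while read -r start end flags; do
    if [ $(( flags & 0x200 )) -ne 0 ] && [ $(( start )) -ne 0 ] && \
       [ $(( start < 0xf0000000 )) -eq 1 ]; then
        needs_patch=1
    fi
done < "/sys/bus/pci/devices/$dev/resource"
[ "$needs_patch" -eq 1 ] && echo "would set max-ram-below-4g=3.5G"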

mati7337 avatar Aug 04 '22 02:08 mati7337