MobilePassThrough
Parameter 'x-pci-sub-device-id' expects an int64 value or range
After running sudo ./start-vm.sh and exiting once, trying to start the VM again with sudo ./start-vm.sh results in the following error:
qemu-system-x86_64: -device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,x-pci-sub-device-id=0x,x-pci-sub-vendor-id=0x,multifunction=on,romfile=/home/samkg/Documents/MobilePassThrough/vm-files/vbios-roms/vbios.rom: Parameter 'x-pci-sub-device-id' expects an int64 value or range
Running sudo reboot seems to fix it, but it is annoying not to be able to start the VM several times in succession.
Is there any known fix for this?
What is your output of sudo lspci -vvv after you get the error?
Do you have Bumblebee installed and, if so, what is the output of sudo optirun echo "Hello"?
Output before error: lspci_1.txt
After error: lspci_2.txt
One odd thing: the LnkSta speed is downgraded from 8 GT/s to 2.5 GT/s. Is this normal?
sudo optirun echo "Hello" prints out Hello, as expected.
It says !!! Unknown header type 7f for your Nvidia GPU in both files. Something is wrong there.
I'm not sure if the LnkSta is a problem or if it's normal. Maybe I can check on my device when I have some more time.
Can you show me the output of the following:
GPU_PCI_ADDRESS=01:00.0
GPU_IDS=$(optirun lspci -n -s "${GPU_PCI_ADDRESS}" | grep -oP "\w+:\w+" | tail -1)
GPU_VENDOR_ID=$(echo "${GPU_IDS}" | cut -d ":" -f1)
GPU_DEVICE_ID=$(echo "${GPU_IDS}" | cut -d ":" -f2)
GPU_SS_IDS=$(optirun lspci -vnn -d "${GPU_IDS}" | grep "Subsystem:" | grep -oP "\w+:\w+")
GPU_SS_VENDOR_ID=$(echo "${GPU_SS_IDS}" | cut -d ":" -f1)
GPU_SS_DEVICE_ID=$(echo "${GPU_SS_IDS}" | cut -d ":" -f2)
echo "GPU_PCI_ADDRESS: ${GPU_PCI_ADDRESS}"
echo "GPU_IDS: $GPU_IDS"
echo "GPU_VENDOR_ID: $GPU_VENDOR_ID"
echo "GPU_DEVICE_ID: $GPU_DEVICE_ID"
echo "GPU_SS_IDS: $GPU_SS_IDS"
echo "GPU_SS_VENDOR_ID: $GPU_SS_VENDOR_ID"
echo "GPU_SS_DEVICE_ID: $GPU_SS_DEVICE_ID"
and also of:
GPU_PCI_ADDRESS=01:00.0
if sudo which optirun &> /dev/null && sudo optirun echo>/dev/null ; then
USE_BUMBLEBEE=true
OPTIRUN_PREFIX="optirun "
else
USE_BUMBLEBEE=false
OPTIRUN_PREFIX=""
fi
GPU_IDS=$(sudo ${OPTIRUN_PREFIX}lspci -n -s "${GPU_PCI_ADDRESS}" | grep -oP "\w+:\w+" | tail -1)
GPU_VENDOR_ID=$(echo "${GPU_IDS}" | cut -d ":" -f1)
GPU_DEVICE_ID=$(echo "${GPU_IDS}" | cut -d ":" -f2)
GPU_SS_IDS=$(sudo ${OPTIRUN_PREFIX}lspci -vnn -d "${GPU_IDS}" | grep "Subsystem:" | grep -oP "\w+:\w+")
GPU_SS_VENDOR_ID=$(echo "${GPU_SS_IDS}" | cut -d ":" -f1)
GPU_SS_DEVICE_ID=$(echo "${GPU_SS_IDS}" | cut -d ":" -f2)
echo "GPU_PCI_ADDRESS: ${GPU_PCI_ADDRESS}"
echo "GPU_IDS: $GPU_IDS"
echo "GPU_VENDOR_ID: $GPU_VENDOR_ID"
echo "GPU_DEVICE_ID: $GPU_DEVICE_ID"
echo "GPU_SS_IDS: $GPU_SS_IDS"
echo "GPU_SS_VENDOR_ID: $GPU_SS_VENDOR_ID"
echo "GPU_SS_DEVICE_ID: $GPU_SS_DEVICE_ID"
echo "OPTIRUN_PREFIX: $OPTIRUN_PREFIX"
echo "LSPCI_OUTPUT: $(sudo ${OPTIRUN_PREFIX}lspci -vnn -d ${GPU_IDS})"
Did you run these before getting the error? Because the output looks perfectly fine. Can you run these after getting the error?
The problem is that this line is missing:
Subsystem: Lenovo Device [17aa:39f5]
or at least that is the symptom...
Because of that, the script can't extract the subsystem vendor id and the subsystem device id which are both required in this line.
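To make this failure easier to spot, the script could check the extracted IDs before building the QEMU argument instead of passing an empty 0x through. The following is only a minimal sketch: the helper names is_valid_pci_id_pair and build_subsystem_args are hypothetical and not part of the existing scripts, and the property names are taken from the error message above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if $1 looks like "vvvv:dddd"
# (two groups of four hex digits separated by a colon).
is_valid_pci_id_pair() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$'
}

# Hypothetical helper: prints the subsystem part of the vfio-pci device
# string for the ID pair in $1, or fails with a readable message if the
# pair is empty or malformed (for example when the Subsystem line is missing).
build_subsystem_args() {
    if is_valid_pci_id_pair "$1"; then
        printf 'x-pci-sub-vendor-id=0x%s,x-pci-sub-device-id=0x%s\n' \
            "${1%%:*}" "${1##*:}"
    else
        echo "error: could not determine GPU subsystem IDs (got: '$1')" >&2
        return 1
    fi
}
```

With this in place, something like SS_ARGS=$(build_subsystem_args "${GPU_SS_IDS}") || exit 1 would stop the script with a clear message rather than letting QEMU fail on an empty value.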
I am not sure why the Subsystem line is missing. Maybe there are deeper issues with your system? Have you checked dmesg for GPU-related errors?
I have only tested the script on a fresh installation of Fedora 29 btw. Maybe you made some changes to the system that my scripts can't compensate for yet.
Edit:
As a dirty workaround you could try to set the subsystem IDs manually by replacing
GPU_SS_IDS=$(optirun lspci -vnn -d "${GPU_IDS}" | grep "Subsystem:" | grep -oP "\w+:\w+")
with
GPU_SS_IDS="17aa:39f5"
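A slightly less invasive variant of this workaround would keep the detection and only fall back to the hard-coded value when detection returns nothing. This is a rough sketch; ss_ids_or_fallback and FALLBACK_GPU_SS_IDS are made-up names, and the fallback value is only valid for this particular laptop.

```shell
#!/bin/sh
# Assumed value: the subsystem IDs reported for this specific Lenovo laptop.
# Other machines need their own value.
FALLBACK_GPU_SS_IDS="17aa:39f5"

# Hypothetical helper: prints the detected IDs in $1, or the fallback
# if detection came back empty.
ss_ids_or_fallback() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$FALLBACK_GPU_SS_IDS"
    fi
}

# Possible usage with the existing detection line:
# GPU_SS_IDS=$(ss_ids_or_fallback "$(optirun lspci -vnn -d "${GPU_IDS}" | grep "Subsystem:" | grep -oP "\w+:\w+")")
```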
I have just pushed a major update, adding support for Fedora 30 and some other changes. Maybe you can give it another shot now.
Thanks! Unfortunately, I no longer have this laptop (ran into some issues), and instead have another one without an iGPU.
I don't think it would be possible for me to test.
Hello,
I ran into this same issue, also on a Lenovo device (P50). What I saw is that if the card is reset, turned off and on, or passed through and then released, the Subsystem line disappears until you reboot. One solution is to check whether the Subsystem line exists when running lspci and, if it does, save those values for that computer in a file (you could match those IDs against the UUID of the device in case the hardware changes in the future).
Edit: turning off/on the card with bbswitch might bring back that value, but not always.
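The caching idea could look roughly like this. It is a sketch under assumptions, not what the scripts currently do: the cache path and the function names extract_subsystem_ids and resolve_subsystem_ids are hypothetical, and the UUID matching mentioned above is left out.

```shell
#!/bin/sh
# Hypothetical cache location; any path that survives until the next VM start works.
SS_ID_CACHE_FILE="${SS_ID_CACHE_FILE:-./gpu-subsystem-ids.cache}"

# Prints "vvvv:dddd" from the "Subsystem:" line of the `lspci -vnn` output
# passed as $1, or nothing if there is no such line.
extract_subsystem_ids() {
    printf '%s\n' "$1" | grep "Subsystem:" \
        | grep -oE '[0-9a-fA-F]{4}:[0-9a-fA-F]{4}' | head -n 1
}

# Prints the subsystem IDs for the lspci output in $1.
# If the Subsystem line is present, the cache is refreshed; if it is
# missing (e.g. after the GPU was passed through and released), the
# previously cached value is used instead.
resolve_subsystem_ids() {
    _ss_ids=$(extract_subsystem_ids "$1" || true)
    if [ -n "$_ss_ids" ]; then
        printf '%s\n' "$_ss_ids" > "$SS_ID_CACHE_FILE"
    elif [ -r "$SS_ID_CACHE_FILE" ]; then
        _ss_ids=$(cat "$SS_ID_CACHE_FILE")
    else
        echo "error: no Subsystem line found and no cached IDs available" >&2
        return 1
    fi
    printf '%s\n' "$_ss_ids"
}
```

The first VM start after a reboot would populate the cache, so later starts would still have the IDs even when lspci no longer reports them.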
What GPU does your laptop have?
A Quadro M2000M. I found that passing the dGPU through and then releasing it makes the Subsystem line disappear.
GPU_PCI_ADDRESS: 01:00.0
GPU_IDS: 10de:13b0
GPU_VENDOR_ID: 10de
GPU_DEVICE_ID: 13b0
GPU_SS_IDS:
GPU_SS_VENDOR_ID:
GPU_SS_DEVICE_ID:
OPTIRUN_PREFIX:
LSPCI_OUTPUT: 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GLM [Quadro M2000M] [10de:13b0] (rev a2) (prog-if 00 [VGA controller])
Flags: fast devsel, IRQ 16
Memory at d3000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at 4000 [size=128]
Expansion ROM at d4080000 [disabled] [size=512K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting <?>
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] Secondary PCI Express
Kernel modules: nvidiafb, nouveau
The above is what it looks like after passing the GPU through and releasing it (on Ubuntu).
However, in my tests with bbswitch, if I turn the card off and then on again, I do get the subsystem IDs back, but for some reason I cannot pass the card to the VM again. This also breaks the HDMI audio device, which stays disabled until I reboot or reset the PCIe device, and resetting loses the subsystem IDs again.
So, in my experience, these are the things that make the subsystem IDs stop showing up:
- passing the dGPU through and releasing it (after the VM is off)
- resetting the PCIe device (/sys/bus/pci/devices/{ID}/reset, or remove and then rescan)
As I only have this laptop, I do not know whether other laptops have this issue. So far, two Lenovo laptops in this thread show the same symptoms.
Okay, in that case I'm not sure. If it was an AMD GPU I would have said that it might be the reset bug, in which case the vendor-reset project may have helped. But I suppose it doesn't apply to Nvidia GPUs.
This issue is hard for me to debug as I don't have a laptop with a Quadro. But if we can somehow pinpoint it further, we could jump on the related mailing list and ask the developers themselves.