nvidia-vaapi-driver

DRM device recognition error on dual GPU devices

Open acdcbyl opened this issue 1 year ago • 12 comments

Thank you all for your efforts! My laptop has an AMD integrated GPU and an NVIDIA discrete GPU, and I need the NVIDIA dGPU to do hardware video decoding for Firefox. I ran vainfo to check that VA-API is working, and I think I've run into a problem. Here are the details:

  • System Information
OS: Arch Linux
Kernel: Linux 6.9.9-arch1-1
Desktop: Hyprland
CPU: AMD Ryzen 7 4800H (16) @ 2.90 GHz
iGPU: AMD Radeon Vega Series / Radeon Vega Mobile Series [Integrated]
dGPU: NVIDIA GeForce RTX 2060 Mobile [Discrete]
NVIDIA driver: nvidia
Driver version: 555.58.02
  • Execute nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.58.02              Driver Version: 555.58.02      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2060        Off |   00000000:01:00.0 Off |                  N/A |
| N/A   52C    P8              4W /   90W |       9MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A    261741      G   /usr/lib/Xorg                                   4MiB |
|    0   N/A  N/A    261815      G   Hyprland                                        1MiB |
+-----------------------------------------------------------------------------------------+
  • Execute vainfo
Trying display: wayland
libva error: /usr/lib/dri/nvidia_drv_video.so init failed
vaInitialize failed with error code 1 (operation failed),exit
  • Execute vainfo with NVD_LOG=1 set to see the driver log
Trying display: wayland
      5452.351155764 [555255-555255] ../nvidia-vaapi-driver-0.0.12/src/vabackend.c:2188       __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 40
      5452.351167125 [555255-555255] ../nvidia-vaapi-driver-0.0.12/src/vabackend.c:2197       __vaDriverInit_1_0 Now have 0 (0 max) instances
      5452.351169590 [555255-555255] ../nvidia-vaapi-driver-0.0.12/src/vabackend.c:2223       __vaDriverInit_1_0 Selecting Direct backend
      5452.359507038 [555255-555255] ../nvidia-vaapi-driver-0.0.12/src/backend-common.c:  31            isNvidiaDrmFd Invalid driver for DRM device: amdgpu
      5452.359519021 [555255-555255] ../nvidia-vaapi-driver-0.0.12/src/vabackend.c:2248       __vaDriverInit_1_0 Exporter failed
libva error: /usr/lib/dri/nvidia_drv_video.so init failed
vaInitialize failed with error code 1 (operation failed),exit

I'm not sure why this happens. It seems the AMD iGPU is being picked up as the DRM device, which is a bit mind-boggling.
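
The line to look for in that log is the `isNvidiaDrmFd Invalid driver for DRM device: ...` message, which names the kernel driver behind the DRM device the VA-API driver was handed. A minimal sketch for pulling that name out of a log; `rejected_drm_drivers` is a hypothetical helper name, and the pattern assumes nothing beyond the message format visible in the log above:

```shell
# Hypothetical helper: read NVD_LOG output on stdin and print each kernel
# driver name that isNvidiaDrmFd rejected, one per line, de-duplicated.
rejected_drm_drivers() {
    sed -n 's/.*Invalid driver for DRM device: \([A-Za-z0-9_-]*\).*/\1/p' | sort -u
}
```

It could be used as `NVD_LOG=1 vainfo 2>&1 | rejected_drm_drivers`. Empty output would suggest the failure is somewhere other than device selection.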

  • Execute vainfo against a specific DRM device: vainfo --display drm --device /dev/dri/renderD128
Trying display: drm
vainfo: VA-API version: 1.21 (libva 2.22.0)
vainfo: Driver version: VA-API NVDEC driver [direct backend]
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileHEVCMain               :	VAEntrypointVLD
      VAProfileVP8Version0_3          :	VAEntrypointVLD
      VAProfileVP9Profile0            :	VAEntrypointVLD
      VAProfileHEVCMain10             :	VAEntrypointVLD
      VAProfileHEVCMain12             :	VAEntrypointVLD
      VAProfileVP9Profile2            :	VAEntrypointVLD
      VAProfileHEVCMain444            :	VAEntrypointVLD
      VAProfileHEVCMain444_10         :	VAEntrypointVLD
      VAProfileHEVCMain444_12         :	VAEntrypointVLD

Obviously, it worked. Now the big question is how to make /dev/dri/renderD128 the default DRM device. It seems this should be possible via an environment variable, but I failed: I tried WLR_DRM_DEVICES (which caused Hyprland to crash) and MOZ_DRM_DEVICE (which didn't seem to have any effect; Firefox still doesn't attempt hardware decoding even with it set).
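
One way to confirm which render node belongs to which GPU is to read the kernel driver bound to each node from sysfs. A minimal sketch, assuming the usual `/sys/class/drm/renderD*/device/driver` symlink layout; `list_render_nodes` is a hypothetical helper name, and its optional argument exists only so the function can be pointed at a different directory:

```shell
# Hypothetical helper: print "<render node> <kernel driver>" for each
# DRM render node found under the given sysfs directory.
list_render_nodes() {
    sysfs="${1:-/sys/class/drm}"
    for node in "$sysfs"/renderD*; do
        # Skip the unexpanded glob and nodes with no bound driver.
        [ -e "$node/device/driver" ] || continue
        driver=$(basename "$(readlink -f "$node/device/driver")")
        printf '/dev/dri/%s %s\n' "$(basename "$node")" "$driver"
    done
}
```

Calling `list_render_nodes` with no argument reads the real sysfs tree. The node reported with the `nvidia` driver is the one to pass to `vainfo --display drm --device`.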

acdcbyl avatar Jul 18 '24 03:07 acdcbyl

I am noticing a similar issue, but I only have one GPU in my system.

vainfo
Trying display: wayland
Trying display: x11
libva error: vaGetDriverNames() failed with unknown libva error
vainfo: VA-API version: 1.21 (libva 2.22.0)
vainfo: Driver version: VA-API NVDEC driver [direct backend]
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain12             : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileHEVCMain444            : VAEntrypointVLD
      VAProfileHEVCMain444_10         : VAEntrypointVLD
      VAProfileHEVCMain444_12         : VAEntrypointVLD

With NVD_LOG=1:

libva error: vaGetDriverNames() failed with unknown libva error
      2395.728171647 [39265-39265] ../src/vabackend.c:2187       __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 10
      2395.728183019 [39265-39265] ../src/vabackend.c:2196       __vaDriverInit_1_0 Now have 0 (0 max) instances
      2395.728185774 [39265-39265] ../src/vabackend.c:2222       __vaDriverInit_1_0 Selecting Direct backend
      2395.738263002 [39265-39265] ../src/direct/direct-export-buf.c:  68      direct_initExporter Searching for GPU: 0 0 128
      2395.738286817 [39265-39265] ../src/direct/direct-export-buf.c:  90      direct_initExporter Found NVIDIA GPU 0 at /dev/dri/renderD128
      2395.738289693 [39265-39265] ../src/direct/nv-driver.c: 267            init_nvdriver Initing nvdriver...
      2395.738315362 [39265-39265] ../src/direct/nv-driver.c: 285            init_nvdriver NVIDIA kernel driver version: 555.58.02, major version: 555, minor version: 58
      2395.738319249 [39265-39265] ../src/direct/nv-driver.c: 292            init_nvdriver Got dev info: 4200 1 2 6

With vainfo --display drm --device /dev/dri/renderD128:

vainfo: VA-API version: 1.21 (libva 2.22.0)
vainfo: Driver version: VA-API NVDEC driver [direct backend]
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain12             : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileHEVCMain444            : VAEntrypointVLD
      VAProfileHEVCMain444_10         : VAEntrypointVLD
      VAProfileHEVCMain444_12         : VAEntrypointVLD

duncanyoyo1 avatar Jul 20 '24 17:07 duncanyoyo1

I have this issue too:

❯  NVD_LOG=1 LIBVA_DRIVER_NAME=nvidia NVD_BACKEND=direct vainfo
Trying display: wayland
libva info: VA-API version 1.22.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /run/opengl-driver/lib/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
       957.721601938 [19427-19427] ../src/vabackend.c:2188       __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 40
       957.721607989 [19427-19427] ../src/vabackend.c:2197       __vaDriverInit_1_0 Now have 0 (0 max) instances
       957.721610895 [19427-19427] ../src/vabackend.c:2223       __vaDriverInit_1_0 Selecting Direct backend
       957.732369563 [19427-19427] ../src/backend-common.c:  31            isNvidiaDrmFd Invalid driver for DRM device: amdgpu
       957.732378340 [19427-19427] ../src/vabackend.c:2248       __vaDriverInit_1_0 Exporter failed
libva error: /run/opengl-driver/lib/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
vaInitialize failed with error code 1 (operation failed),exit

It works when I point vainfo at the NVIDIA render node explicitly: NVD_LOG=1 vainfo --display drm --device /dev/dri/by-path/pci-0000:01:00.0-render

Trying display: drm
libva info: VA-API version 1.22.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /run/opengl-driver/lib/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
      4232.998322754 [42848-42848] ../src/vabackend.c:2188       __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 31
      4232.998327463 [42848-42848] ../src/vabackend.c:2197       __vaDriverInit_1_0 Now have 0 (0 max) instances
      4232.998330238 [42848-42848] ../src/vabackend.c:2223       __vaDriverInit_1_0 Selecting Direct backend
      4233.005616676 [42848-42848] ../src/direct/nv-driver.c: 267            init_nvdriver Initing nvdriver...
      4233.005647463 [42848-42848] ../src/direct/nv-driver.c: 285            init_nvdriver NVIDIA kernel driver version: 555.58.02, major version: 555, minor version: 58
      4233.005651551 [42848-42848] ../src/direct/nv-driver.c: 292            init_nvdriver Got dev info: 100 1 2 6
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.22 (libva 2.22.0)
vainfo: Driver version: VA-API NVDEC driver [direct backend]
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain12             : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileHEVCMain444            : VAEntrypointVLD
      VAProfileHEVCMain444_10         : VAEntrypointVLD
      VAProfileHEVCMain444_12         : VAEntrypointVLD
      4233.247716911 [42848-42848] ../src/vabackend.c:2098              nvTerminate Terminating 0x84a8e0
      4233.248419924 [42848-42848] ../src/vabackend.c:2112              nvTerminate Now have 0 (0 max) instances

caniko avatar Aug 11 '24 05:08 caniko

Unfortunately multiple GPUs can pose a problem. Firefox seems to only want to run on the default device, and if that device isn't the NVIDIA one, then even if you set the correct NVD_GPU settings, Firefox won't be able to import the frames that the driver exports.

elFarto avatar Sep 09 '24 11:09 elFarto

I actually got it to work.

It was very hard, but I did it:

__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia VDPAU_DRIVER=nvidia LIBVA_DRIVER_NAME=nvidia VAAPI_DEVICE=/dev/dri/by-path/pci-0000:01:00.0-render MOZ_DISABLE_RDD_SANDBOX=1 NVD_BACKEND=direct floorp

I use Floorp, a Firefox fork.
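
The same one-liner can be kept in a small wrapper so it is easier to read and reuse. This is only a sketch of the command above, not a verified recipe: `run_on_nvidia` and the `RENDER_NODE` override are hypothetical names, and the default path is the one from this comment, which will differ on other machines.

```shell
# Hypothetical wrapper around the environment used in the comment above.
# RENDER_NODE defaults to the path from that comment; override it as needed.
run_on_nvidia() {
    node="${RENDER_NODE:-/dev/dri/by-path/pci-0000:01:00.0-render}"
    # env sets the variables only for the command passed as arguments.
    env \
        __NV_PRIME_RENDER_OFFLOAD=1 \
        __GLX_VENDOR_LIBRARY_NAME=nvidia \
        VDPAU_DRIVER=nvidia \
        LIBVA_DRIVER_NAME=nvidia \
        VAAPI_DEVICE="$node" \
        MOZ_DISABLE_RDD_SANDBOX=1 \
        NVD_BACKEND=direct \
        "$@"
}
```

With this defined, `run_on_nvidia floorp` would launch the browser with the same environment as the one-liner, without exporting the variables into the rest of the shell session.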

caniko avatar Sep 09 '24 14:09 caniko

It's working for me, too.

acdcbyl avatar Oct 01 '24 06:10 acdcbyl

The command above doesn't work for me, sadly; it still tries to use my other DRM device:

❯ __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia VDPAU_DRIVER=nvidia LIBVA_DRIVER_NAME=nvidia VAAPI_DEVICE=/dev/dri/by-path/pci-0000:01:00.0-render MOZ_DISABLE_RDD_SANDBOX=1 NVD_BACKEND=direct NVD_LOG=1 vainfo
Trying display: wayland
       544.630667499 [2759-2759] ../nvidia-vaapi-driver-0.0.12/src/vabackend.c:2188       __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 40
       544.630683051 [2759-2759] ../nvidia-vaapi-driver-0.0.12/src/vabackend.c:2197       __vaDriverInit_1_0 Now have 0 (0 max) instances
       544.630710245 [2759-2759] ../nvidia-vaapi-driver-0.0.12/src/vabackend.c:2223       __vaDriverInit_1_0 Selecting Direct backend
       544.695960162 [2759-2759] ../nvidia-vaapi-driver-0.0.12/src/backend-common.c:  31            isNvidiaDrmFd Invalid driver for DRM device: i915
       544.695993021 [2759-2759] ../nvidia-vaapi-driver-0.0.12/src/vabackend.c:2248       __vaDriverInit_1_0 Exporter failed
libva error: /usr/lib/dri/nvidia_drv_video.so init failed
vaInitialize failed with error code 1 (operation failed),exit

AmmoniumX avatar Oct 03 '24 16:10 AmmoniumX

Here's the thing: vainfo still reports the wrong DRM device, but Firefox can use hardware acceleration once the MOZ_DRM_DEVICE and __NV_PRIME_RENDER_OFFLOAD environment variables are set. That's my method; you can try it.

acdcbyl avatar Oct 04 '24 10:10 acdcbyl

I have a laptop with an Intel iGPU and an NVIDIA eGPU, running Debian trixie with X11. This driver works perfectly fine if I start X directly on the eGPU, but I usually prefer to run X on the iGPU and offload work to the eGPU, so that I can easily unplug it when I need to move. I tried all the env vars mentioned here, including MOZ_DRM_DEVICE, but unfortunately I haven't been able to get it working: I get isNvidiaDrmFd Invalid driver for DRM device: i915 no matter what :(

Here is my exact command line:

NVD_LOG=~/nvvaapi.log \
MOZ_DISABLE_RDD_SANDBOX=1 \
LIBVA_DRIVER_NAME=nvidia \
__NV_PRIME_RENDER_OFFLOAD=1 \
__GLX_VENDOR_LIBRARY_NAME=nvidia \
NVD_BACKEND=direct \
VDPAU_DRIVER=nvidia \
VAAPI_DEVICE=/dev/dri/renderD129 \
MOZ_DRM_DEVICE=/dev/dri/renderD129 \
MOZ_X11_EGL=1 \
firefox-esr

Here is my device info:

morgwai@morgwai-xps13:/dev/dri$ ls -l
total 0
drwxr-xr-x  2 root root        120 Jan 28 14:18 by-path
crw-rw----+ 1 root video  226,   0 Jan 28 21:25 card0
crw-rw----+ 1 root video  226,   1 Jan 28 21:25 card1
crw-rw----+ 1 root render 226, 128 Jan 28 14:14 renderD128
crw-rw----+ 1 root render 226, 129 Jan 28 14:18 renderD129
morgwai@morgwai-xps13:/dev/dri$ ls -l by-path/
total 0
lrwxrwxrwx 1 root root  8 Jan 28 14:14 pci-0000:00:02.0-card -> ../card0
lrwxrwxrwx 1 root root 13 Jan 28 14:14 pci-0000:00:02.0-render -> ../renderD128
lrwxrwxrwx 1 root root  8 Jan 28 14:18 pci-0000:0b:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 Jan 28 14:18 pci-0000:0b:00.0-render -> ../renderD129
morgwai@morgwai-xps13:/dev/dri$ lspci |grep VGA
00:02.0 VGA compatible controller: Intel Corporation Iris Plus Graphics 640 (rev 06)
0b:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)
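
Because `renderD*` numbering can change between boots or when an eGPU is plugged in, the `by-path` links are the more stable way to name a GPU. A minimal sketch that turns a PCI address from `lspci` into its render node; `render_node_for_pci` is a hypothetical helper, it assumes the `pci-<domain>:<bus>:<dev>.<fn>-render` link naming shown in the listing above, and it prepends the `0000` domain when the address is given without one:

```shell
# Hypothetical helper: resolve a PCI address to its DRM render node via
# the by-path symlinks. The second argument overrides the link directory.
render_node_for_pci() {
    addr="$1"
    dir="${2:-/dev/dri/by-path}"
    # lspci prints "0b:00.0" without a domain; by-path uses "0000:0b:00.0".
    case "$addr" in
        *:*:*) ;;
        *) addr="0000:$addr" ;;
    esac
    link="$dir/pci-$addr-render"
    # Fail quietly if there is no render node for that address.
    [ -e "$link" ] || return 1
    readlink -f "$link"
}
```

Going by the listing above, `render_node_for_pci 0b:00.0` should resolve to the node behind the `pci-0000:0b:00.0-render` link.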

In Firefox's about:config I have media.ffmpeg.vaapi.enabled, media.rdd-ffmpeg.enabled, gfx.x11-egl.force-enabled and widget.dmabuf.force-enabled all set to true. My Firefox version is 128.6.0esr.

If someone has any more ideas how to make it work, I'd be grateful! :)

morgwai avatar Jan 29 '25 12:01 morgwai

@morgwai My setup looks basically the same, except that my NVIDIA GPU is onboard (Quadro T2000 Mobile), and I have exactly the same issue. I'm using NixOS.

My full config is here (not that this will be meaningful for Debian): https://github.com/randomizedcoder/nixos/tree/main/laptops/t

randomizedcoder avatar Jan 31 '25 19:01 randomizedcoder

@morgwai

Blacklisting i915 definitely made my NVIDIA experience better, although this option might not work for you.

Tracking in this thread https://discourse.nixos.org/t/nvidia-open-breaks-hardware-acceleration/58770/

randomizedcoder avatar Jan 31 '25 22:01 randomizedcoder

@randomizedcoder, blacklisting i915 defeats the whole purpose ;-] Also, having i915 loaded does not break NVIDIA hardware video decoding for me (it works fine as long as X is started on the NVIDIA GPU, regardless of whether i915 is loaded). My problem is that Firefox chooses the wrong device despite having all the env vars defined similarly to @acdcbyl.

Having written the above actually made me realize I should ask this: @acdcbyl, which version of Firefox are you using?

Also, there's one more interesting detail regarding my setup that I've recently noticed: I don't have drm display available in vainfo:

morgwai@morgwai-xps13:~$ vainfo --display help
Available displays:
  wayland
  x11
morgwai@morgwai-xps13:~$ vainfo --display drm --device /dev/dri/renderD129 
error: failed to initialize display 'drm'

This made me think: maybe I lack some "drm abstraction layer" that actually allows Firefox to choose a drm device?

morgwai avatar Feb 01 '25 07:02 morgwai

I've just tried Firefox 135: same issue.

morgwai avatar Feb 04 '25 15:02 morgwai