2.5.1 breaks /dev/dri detection - Hardware acceleration is dead

Open FlattusBlastus opened this issue 8 months ago • 22 comments

Windows Version

Windows version: 10.0.26100.3476

WSL Version

2.5.1

Are you using WSL 1 or WSL 2?

  • [x] WSL 2
  • [ ] WSL 1

Kernel Version

6.6.75

Distro Version

Docker Desktop - Alpine

Other Software

DD 4.39.0

Repro Steps

Install WSL 2.5.1. Nextcloud AIO no longer starts and fails with: (HTTP code 500) server error - error gathering device information while adding custom device "/dev/dri": no such file or directory

Expected Behavior

Everything works as usual including hardware acceleration

Actual Behavior

As reported - Failed to launch NC AIO because HW acceleration no longer works

Revert to 2.4.12 works

Diagnostic Logs

Just letting y'all know of the problem. Sorry, I can't work through it with you.

FlattusBlastus avatar Mar 16 '25 05:03 FlattusBlastus

Logs are required for review from WSL team

If this is a feature request, please reply with '/feature'. If this is a question, reply with '/question'. Otherwise please attach logs by following the instructions below; your issue will not be reviewed unless they are added. These logs will help us understand what is going on in your machine.

How to collect WSL logs

Download and execute collect-wsl-logs.ps1 in an administrative powershell prompt:

Invoke-WebRequest -UseBasicParsing "https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/collect-wsl-logs.ps1" -OutFile collect-wsl-logs.ps1
Set-ExecutionPolicy Bypass -Scope Process -Force
.\collect-wsl-logs.ps1

The script will output the path of the log file once done.

If this is a networking issue, please use collect-networking-logs.ps1, following the instructions here

Once completed please upload the output files to this Github issue.

Click here for more info on logging

If you choose to email these logs instead of attaching to the bug, please send them to [email protected] with the number of the github issue in the subject, and in the message a link to your comment in the github issue and reply with '/emailed-logs'.

github-actions[bot] avatar Mar 16 '25 06:03 github-actions[bot]

@FlattusBlastus if you're missing /dev/dri, it is most likely that your vgem module (the Virtual GEM driver) didn't load.

zcobol avatar Mar 16 '25 16:03 zcobol

Happening to me also. With 2.5.1:

Image

after uninstalling and installing 2.4.12 it worked again

crramirez avatar Mar 18 '25 14:03 crramirez

The vgem driver is not loaded automatically under WSL 2.5.1.

Sample output on Ubuntu-25.04:

List modules after launch:

elsaco@tokyo:~$ lsmod
Module                  Size  Used by
intel_rapl_msr         16384  0
intel_rapl_common      32768  1 intel_rapl_msr
crc32c_intel           16384  0
sch_fq_codel           16384  1
configfs               53248  1
autofs4                45056  0
br_netfilter           28672  0
bridge                282624  1 br_netfilter
stp                    12288  1 bridge
llc                    12288  2 bridge,stp
ip_tables              28672  0
tun                    53248  0

Check if /dev/dri is present:

elsaco@tokyo:~$ ls /dev/dri/
ls: cannot access '/dev/dri/': No such file or directory

Check if vgem module is available:

elsaco@tokyo:~$ modinfo vgem
filename:       /lib/modules/6.6.75.1-microsoft-standard-WSL2/kernel/drivers/gpu/drm/vgem/vgem.ko
license:        GPL and additional rights
description:    Virtual GEM provider
author:         Intel Corporation
author:         Red Hat, Inc.
depends:        drm_shmem_helper
retpoline:      Y
intree:         Y
name:           vgem
vermagic:       6.6.75.1-microsoft-standard-WSL2 SMP preempt mod_unload modversions

Load module:

elsaco@tokyo:~$ sudo modprobe vgem

Check if device is added:

elsaco@tokyo:~$ ls -l /dev/dri/
total 0
drwxr-xr-x 2 root root         80 Mar 18 18:46 by-path
crw-rw---- 1 root video  226,   0 Mar 18 18:46 card0
crw-rw---- 1 root render 226, 128 Mar 18 18:46 renderD128
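
The manual check-then-load sequence above can be folded into one small helper. This is only a sketch: the function name `dri_status` and its optional directory argument are made up for illustration (the argument exists so the check can be tried against a scratch directory), and the `modprobe` call is left as a comment because it needs root.

```shell
#!/bin/sh
# Report whether DRM device nodes exist under the given directory.
# dri_status and its directory argument are hypothetical helpers,
# not something shipped with WSL.
dri_status() {
    dri_dir="${1:-/dev/dri}"
    if [ -e "$dri_dir/renderD128" ] || [ -e "$dri_dir/card0" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Load vgem only when the device nodes are absent (requires root):
#   [ "$(dri_status)" = "present" ] || sudo modprobe vgem
```

Checking for the nodes first keeps the helper safe to run repeatedly, for example from a login script.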

Host OS is Windows 10 with WSL 2.5.1. After loading the module, vainfo works on Noble but not on Plucky:

vainfo-2.12.0 on Ubuntu Noble:

elsaco@tokyo:~$ vainfo --display drm
libva info: VA-API version 1.20.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/d3d12_drv_video.so
libva info: Found init function __vaDriverInit_1_20
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.20 (libva 2.12.0)
vainfo: Driver version: Mesa Gallium driver 24.2.8-1ubuntu1~24.04.1 for D3D12 (NVIDIA RTX A4000)
vainfo: Supported profile and entrypoints
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointEncSlice
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

vainfo-2.22.0 on Ubuntu Plucky:

elsaco@tokyo:~$ vainfo --display drm
Trying display: drm
libva info: VA-API version 1.22.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/d3d12_drv_video.so
libva info: Found init function __vaDriverInit_1_22
libva error: /usr/lib/x86_64-linux-gnu/dri/d3d12_drv_video.so init failed
libva info: va_openDriver() returns 2
vaInitialize failed with error code 2 (resource allocation failed),exit

WSL info:

WSL version: 2.5.1.0
Kernel version: 6.6.75.1-1
WSLg version: 1.0.66
MSRDC version: 1.2.5716
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.19045.5608

elsaco avatar Mar 19 '25 01:03 elsaco

To have the module loaded at boot, add a file /etc/modules-load.d/vgem.conf containing the name of the driver (vgem in this case).

Adding it to /etc/modules is not recommended anymore:

# /etc/modules is obsolete and has been replaced by /etc/modules-load.d/.
# Please see modules-load.d(5) and modprobe.d(5) for details.
#
# Updating this file still works, but it is undocumented and unsupported.
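
A minimal sketch of creating that file follows. The function name `write_vgem_conf` and its directory argument are invented for illustration, so the write can be tried against a scratch directory first. Files in modules-load.d are read by systemd-modules-load.service, so this presumably only helps in a distro where systemd is enabled.

```shell
#!/bin/sh
# Write a modules-load.d entry that names the vgem module.
# conf_dir defaults to the real location; pass another directory to dry-run.
# (write_vgem_conf is a hypothetical helper name.)
write_vgem_conf() {
    conf_dir="${1:-/etc/modules-load.d}"
    mkdir -p "$conf_dir" &&
        printf '%s\n' vgem > "$conf_dir/vgem.conf"
}

# Writing to the real location needs root, for example:
#   sudo sh -c 'printf "%s\n" vgem > /etc/modules-load.d/vgem.conf'
```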

zcobol avatar Mar 20 '25 05:03 zcobol

zcobol, should I get the Nextcloud AIO devs to include this?

FlattusBlastus avatar Mar 21 '25 15:03 FlattusBlastus

Going to try 2.5.4

FlattusBlastus avatar Mar 27 '25 06:03 FlattusBlastus

2.5.6 still broken

FlattusBlastus avatar Apr 09 '25 03:04 FlattusBlastus

I'm not super familiar with this driver. On a "typical" Linux kernel, is this module compiled in?

For some more context, with the 6.X kernel we now include many modules, but fewer are compiled in. We have a small number of modules that we load as part of boot, and vgem is not currently one of them (would be easy to add).
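
One way to see how a given kernel ships a driver is to look at the module index files under /lib/modules. The sketch below assumes the usual `modules.builtin` and `modules.dep` files are present; the function name `module_kind` and its optional directory argument are hypothetical. It matches on the file name, so modules whose file name uses hyphens where the module name uses underscores would need extra handling.

```shell
#!/bin/sh
# Classify a module as builtin, loadable, or absent for a kernel tree.
# mod_dir defaults to the running kernel's module directory.
module_kind() {
    name="$1"
    mod_dir="${2:-/lib/modules/$(uname -r)}"
    # modules.builtin lists drivers compiled into the kernel image
    if grep -Eq "(^|/)${name}\.ko$" "$mod_dir/modules.builtin" 2>/dev/null; then
        echo "builtin"
    # modules.dep has one "path/to/name.ko[.ext]: deps" line per loadable module
    elif grep -Eq "(^|/)${name}\.ko[^:]*:" "$mod_dir/modules.dep" 2>/dev/null; then
        echo "loadable"
    else
        echo "absent"
    fi
}

# Example: module_kind vgem
```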

benhillis avatar Apr 09 '25 16:04 benhillis

The thing is that without it, the GPU video acceleration won't work out of the box as it does in WSL 2.4

crramirez avatar Apr 09 '25 16:04 crramirez

@crramirez - I understand that and I'm writing a fix now, but who typically loads this driver on a typical Linux distro? systemd, or is it usually compiled in?

benhillis avatar Apr 09 '25 16:04 benhillis

It's used in Nextcloud AIO, Immich, and other solutions.

FlattusBlastus avatar Apr 09 '25 17:04 FlattusBlastus

vainfo-2.22.0 on Ubuntu Plucky:

elsaco@tokyo:~$ vainfo --display drm
Trying display: drm
libva info: VA-API version 1.22.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/d3d12_drv_video.so
libva info: Found init function __vaDriverInit_1_22
libva error: /usr/lib/x86_64-linux-gnu/dri/d3d12_drv_video.so init failed
libva info: va_openDriver() returns 2
vaInitialize failed with error code 2 (resource allocation failed),exit

@elsaco - do you have any idea what would cause things to work with one version of Ubuntu and not another? Is it possible the d3d lib has a dependency on some kernel change?

benhillis avatar Apr 09 '25 18:04 benhillis

@benhillis Any word on which version this fix is going into?

FlattusBlastus avatar Apr 13 '25 20:04 FlattusBlastus

Same issue here after installing WSL 2.5.6. I think the change to the 6.6.x kernel with loadable modules broke this. Running sudo modprobe vgem solves the problem.

Image

KylinDemons avatar Apr 14 '25 04:04 KylinDemons

Is there any chance this bug will ever be fixed?

erickvieira avatar May 30 '25 17:05 erickvieira

@benhillis As far as I know, the systemd-udevd service is the one in charge of loading the module, but for some reason it is not loading it with the 6.x kernel.

crramirez avatar Jun 02 '25 18:06 crramirez

still dead in 2.5.8

FlattusBlastus avatar Jun 04 '25 03:06 FlattusBlastus

@benhillis I am willing to do a live troubleshooting session whenever you are available

FlattusBlastus avatar Jun 04 '25 03:06 FlattusBlastus

echo "vgem" | sudo tee -a /etc/modules

Did not work for me. Still getting: (HTTP code 500) server error - error gathering device information while adding custom device "/dev/dri": no such file or directory

It happens quicker than the eye can see. Dies immediately.

Reverted back to 2.4.13.0

FlattusBlastus avatar Jun 04 '25 04:06 FlattusBlastus

GUESS WHAT PEEPS!!!! https://github.com/Nevuly/WSL2-Linux-Kernel-Rolling WORKS with just the kernel added. My .wslconfig

[boot]
systemd=true

[wsl2]
kernel=C:\\WSLKERNEL\\bzImage-x86_64

FlattusBlastus avatar Jun 04 '25 04:06 FlattusBlastus

Off topic: why do you have [boot] systemd=true in your .wslconfig? It belongs in /etc/wsl.conf.
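
To illustrate the split, the sketch below prints the two files as they would look if the settings were separated. The function names are made up, and the kernel path is simply the one quoted in the .wslconfig above.

```shell
#!/bin/sh
# Print the two config files with each setting in the file it belongs to.
# print_wslconfig and print_wsl_conf are hypothetical helper names.

# Windows side, global to all distros: %UserProfile%\.wslconfig
print_wslconfig() {
    cat <<'EOF'
[wsl2]
kernel=C:\\WSLKERNEL\\bzImage-x86_64
EOF
}

# Linux side, per distro: /etc/wsl.conf
print_wsl_conf() {
    cat <<'EOF'
[boot]
systemd=true
EOF
}

print_wslconfig
print_wsl_conf
```

After changing either file, the distro has to be restarted (for example with wsl --shutdown from Windows) before the settings apply.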

crramirez avatar Jun 04 '25 11:06 crramirez

We're facing the same issues on all our NVIDIA machines. None of them can open a hardware-accelerated VA-API context.

We have tried:

  • WSL 2.5.10 with Ubuntu 24.04 and Ubuntu 22.04 (with additional kisak mesa drivers)
  • modprobe vgem
  • trying various replacement kernels, including Linux Scratchy 6.15.0-WSL2-STABLE+ #1 SMP PREEMPT_DYNAMIC Mon May 26 10:21:42 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux from the rolling kernel that @FlattusBlastus linked

We can at least get NVIDIA OpenGL acceleration working (in this case for running Blender) with the following steps:

# preparing VA-API
$ sudo add-apt-repository ppa:oibaf/graphics-drivers
$ sudo apt-get update && sudo apt-get upgrade
$ sudo apt-get install ppa-purge
$ sudo ppa-purge ppa:oibaf/graphics-drivers
$ sudo apt-get update && sudo apt-get upgrade
$ sudo apt-get install mesa-va-drivers
$ sudo usermod -a -G video $USER

# add vgem as kernel module on wsl start
$ echo "vgem" | sudo tee -a /etc/modules

# adding environment variables globally to load the correct drivers and make mesa select the correct device
$ sudo -i
$ cat <<EOL > /etc/profile.d/nvidia-driver.sh
export GALLIUM_DRIVER=d3d12
export LIBVA_DRIVER_NAME=d3d12
export LD_LIBRARY_PATH=/usr/lib/wsl/lib
EOL
$ exit

# testing
$ glxinfo -B
name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Microsoft Corporation (0xffffffff)
    Device: D3D12 (NVIDIA GeForce RTX 4070) (0xffffffff)
$ vainfo --display drm --device /dev/dri/card0
libva info: VA-API version 1.20.0
libva info: User environment variable requested driver 'd3d12'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/d3d12_drv_video.so
libva info: Found init function __vaDriverInit_1_20
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.20 (libva 2.12.0)
vainfo: Driver version: Mesa Gallium driver 25.0.7-0ubuntu0.24.04.1 for D3D12 (NVIDIA GeForce RTX 4070)
vainfo: Supported profile and entrypoints
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointEncSlice
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

This allows us to run accelerated environments currently.


The system environment is:

$ hostnamectl
 Static hostname: host
       Icon name: computer-container
         Chassis: container ☐
      Machine ID: 99f6175e563544e1b298f799c6b66559
         Boot ID: 0a1dcf3c094a47d4b6381d7e3f044c9d
  Virtualization: wsl
Operating System: Ubuntu 24.04.3 LTS
          Kernel: Linux 6.6.87.2-microsoft-standard-WSL2
    Architecture: x86-64

mio-moto avatar Sep 02 '25 14:09 mio-moto

@mio-moto I see some things in your configuration that you should take a look at:

  1. LD_LIBRARY_PATH: I have NVIDIA and WSL 2.5.10, and Mesa acceleration and VA-API work well for me, including mpv and ollama that use CUDA, without setting this variable. I recommend unsetting it.
  2. vainfo is complaining about XDG_RUNTIME_DIR, which should have a value assigned automatically. With systemd this value should be /run/user/1000/ (for UID 1000), and without it, /mnt/wslg/runtime-dir. In any case, systemd must be enabled for VA-API, as stated in: https://devblogs.microsoft.com/commandline/d3d12-gpu-video-acceleration-in-the-windows-subsystem-for-linux-now-available/
  3. I don't see the variable LIBVA_DRIVER_NAME in your configuration. It should have the value d3d12.
  4. Do not run vainfo with sudo. If you get device-not-found errors, make sure that systemd is enabled and that your user belongs to the video group.

In my system I only need to modprobe vgem for everything to work.
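
The checklist above can be turned into a quick read-only report. This is a sketch under assumptions: the function name `vaapi_preflight` is invented, and the presence of /run/systemd/system is used as the test for systemd being the running init.

```shell
#!/bin/sh
# Print the settings named in the checklist. Nothing is modified.
# vaapi_preflight is a hypothetical helper name.
vaapi_preflight() {
    echo "LIBVA_DRIVER_NAME=${LIBVA_DRIVER_NAME:-<unset>} (expected: d3d12)"
    echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR:-<unset>}"
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
    # id -nG lists the group names of the current user
    if id -nG | tr ' ' '\n' | grep -qx video; then
        echo "video group: yes"
    else
        echo "video group: no"
    fi
    # this directory exists only when systemd is the running init
    if [ -d /run/systemd/system ]; then
        echo "systemd: running"
    else
        echo "systemd: not running"
    fi
}

vaapi_preflight
```

Run it as the normal user rather than with sudo, so the group and environment values reflect the session that will call vainfo.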

crramirez avatar Sep 02 '25 14:09 crramirez

@mio-moto I see some things in your configuration that you should take a look at:

  1. LD_LIBRARY_PATH: I have NVIDIA and WSL 2.5.10, and Mesa acceleration and VA-API work well for me, including mpv and ollama that use CUDA, without setting this variable. I recommend unsetting it.

LD_LIBRARY_PATH is required to load the appropriate drivers required for Blender to bind the OpenGL device. ollama uses CUDA, which has nothing (? I think?) to do with VA-API.

  2. vainfo is complaining about XDG_RUNTIME_DIR, which should have a value assigned automatically. With systemd this value should be /run/user/1000/ (for UID 1000), and without it, /mnt/wslg/runtime-dir. In any case, systemd must be enabled for VA-API, as stated in: https://devblogs.microsoft.com/commandline/d3d12-gpu-video-acceleration-in-the-windows-subsystem-for-linux-now-available/
  3. I don't see the variable LIBVA_DRIVER_NAME in your configuration. It should have the value d3d12.
  4. Do not run vainfo with sudo. If you get device-not-found errors, make sure that systemd is enabled and that your user belongs to the video group.

That seems to shed some light on which combinations are supposed to work.

$ sudo usermod -a -G video $USER
$ LIBVA_DRIVER_NAME=d3d12 vainfo --display drm --device /dev/dri/card0
libva info: VA-API version 1.20.0
libva info: User environment variable requested driver 'd3d12'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/d3d12_drv_video.so
libva info: Found init function __vaDriverInit_1_20
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.20 (libva 2.12.0)
vainfo: Driver version: Mesa Gallium driver 25.0.7-0ubuntu0.24.04.1 for D3D12 (NVIDIA GeForce RTX 3090 Ti)
vainfo: Supported profile and entrypoints
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointEncSlice
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

# therefore
$ sudo tee /etc/profile.d/nvidia-driver.sh > /dev/null <<EOL
export GALLIUM_DRIVER=d3d12
export LIBVA_DRIVER_NAME=d3d12
export LD_LIBRARY_PATH=/usr/lib/wsl/lib
EOL

Thanks so far.

mio-moto avatar Sep 02 '25 15:09 mio-moto

I'll make a note that Blender needs LD_LIBRARY_PATH=/usr/lib/wsl/lib.

Thank you

crramirez avatar Sep 02 '25 15:09 crramirez