
Run a container (Ubuntu 20.04) with GPU acceleration; the container uses OpenGL renderer: llvmpipe (LLVM 15.0.7, 256 bits) instead of NVIDIA GeForce RTX 3060/PCIe/SSE2

lightbot21 opened this issue 1 year ago • 8 comments

I use sudo docker run -it -p 6901:6901 -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all --gpus="all" --name nvidia-test -e VNC_PW=123456 kasm-lrae nvidia-smi to create a container. I hope the container can use my NVIDIA GPU to render Gazebo, which requires OpenGL. But when I check the container with glxinfo -B, it reports llvmpipe as the OpenGL renderer (screenshot: llvmpipe), so it seems the container is not using the GPU for OpenGL rendering. The expected output, according to the KasmVNC GPU Acceleration docs, would show the NVIDIA renderer (screenshot: glxinfo). nvidia-smi gives the expected output both on the host and in the container (screenshot: nvidia-smi), which indicates that the NVIDIA driver is installed correctly.

Other information:
Host system: Ubuntu 20.04.
The kasm-lrae image is based on kasmweb/ubuntu-focal-desktop:1.16.0-rolling-daily.
The content of daemon.json is as follows:

{
    "registry-mirrors": [
        "https://reg-mirror.qiniu.com/"
    ],
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
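
A quick way to confirm that Docker has actually picked up the nvidia runtime and that the GPU is reachable from a container at all (the CUDA image tag below is just an example):

# Show registered runtimes and the default runtime
docker info | grep -i -A 1 runtime

# Basic GPU visibility check with a plain CUDA image
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi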

lightbot21 avatar Dec 02 '24 14:12 lightbot21

Same here!! Hoping for a solution.

codernew007 avatar Jan 07 '25 07:01 codernew007

I do not have a DRI interface available... so I use this to get into a console with render offloading.

#!/usr/bin/bash
# Start a shell with NVIDIA PRIME render offload and the NVIDIA Vulkan/VA-API drivers selected
env __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only \
    VK_DRIVER_FILES=/usr/share/vulkan/icd.d/nvidia_icd.json LIBVA_DRIVER_NAME=nvidia bash

You can create the JSON file like this:

cat /usr/share/vulkan/icd.d/virtio_icd.x86_64.json | jq '.ICD.library_path="/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0"' > /usr/share/vulkan/icd.d/nvidia_icd.json
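
If there is no virtio ICD file to copy from, writing the manifest by hand should also work; this is only a sketch, and the api_version should roughly match your driver's Vulkan version:

cat > /usr/share/vulkan/icd.d/nvidia_icd.json <<'EOF'
{
    "file_format_version": "1.0.0",
    "ICD": {
        "library_path": "/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0",
        "api_version": "1.3.0"
    }
}
EOF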

The VK_ variables and the ICD file are only required if you need Vulkan.

This works for me with OpenSCAD, Blender, Minetest... except that with the last one I have some other nasty mouse issues (#310).
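
To confirm the offload actually takes effect, checking the renderer from within that shell should report the NVIDIA GPU instead of llvmpipe (assuming glxinfo from mesa-utils is installed):

# Should print the NVIDIA vendor/renderer when offloading works
glxinfo -B | grep -E "OpenGL (vendor|renderer) string"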

hochbit avatar Jan 15 '25 16:01 hochbit

@Maipengfei @codernew007

I suggest combining VirtualGL to make things work. Check https://github.com/selkies-project/docker-nvidia-egl-desktop/blob/91ba69533d707cf21933cf30097491315d7805cf/Dockerfile#L296 and https://github.com/selkies-project/docker-nvidia-egl-desktop/blob/91ba69533d707cf21933cf30097491315d7805cf/entrypoint.sh#L98.

If you wish, you may directly use my container implementation, which has a flag to enable KasmVNC.

ehfd avatar Feb 16 '25 07:02 ehfd


@hochbit Does this work with actual X11 windows, or only with offscreen workloads?

ehfd avatar Feb 16 '25 07:02 ehfd

Relevant: https://github.com/TigerVNC/tigervnc/issues/1773

ehfd avatar Feb 16 '25 08:02 ehfd

@ehfd With X11 - I'm using it for UE5Editor (combined with my "hack" to have the mouse pointer in game mode).

I have now changed my /kclient/startwm.sh, adding this nvidia-smi check:


# Enable Nvidia GPU support if detected
if which nvidia-smi; then
    export __NV_PRIME_RENDER_OFFLOAD=1
    export __GLX_VENDOR_LIBRARY_NAME=nvidia
    export __VK_LAYER_NV_optimus=NVIDIA_only
    export VK_DRIVER_FILES=/usr/share/vulkan/icd.d/nvidia_icd.json
    export LIBVA_DRIVER_NAME=nvidia
fi

...

Now all applications started from the desktop use the GPU. The container runs in a Pod in Kubernetes. It also works locally for me with docker run ... --gpus all ...
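
A quick sanity check from the host that desktop-launched apps really see the GPU (assuming the container's X display is :1, which may differ in your image):

# <container-name> is a placeholder for your running container
docker exec -it <container-name> bash -c 'DISPLAY=:1 glxinfo -B | grep "OpenGL renderer"'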

hochbit avatar Mar 18 '25 14:03 hochbit

When you run our containers within Kasm Workspaces, we take care of the docker run configuration auto-magically. If you are running our containers manually at the CLI, then you are responsible for ensuring everything is correct, and since everyone's system is different, it is really hard to provide guidance for manual CLI commands that just work for everyone.

I don't run these containers manually with GPU acceleration often, but looking at the code for Kasm Workspaces, I can see that we not only set nvidia as the runtime and some environment variables, but we also pass through the DRI card and render node, which are typically located at /dev/dri/card0 and /dev/dri/renderD128 respectively. You may have multiple if your system has more than one video card. You need to pass those into the container, and in addition you need to set the environment variables KASM_EGL_CARD and KASM_RENDERD to the locations just mentioned.

Kasm Workspaces was designed to support systems with multiple NVIDIA GPUs, mostly for enterprise-grade data center servers, so Kasm may decide to pass card0 into container A and card1 into container B. All that being said, most people would need to add the following to their docker run command for the container to work as designed. Again, these containers are intended to be run inside Kasm Workspaces; we don't do anything that prohibits you from running them directly, to be clear...

docker run -it -e KASM_EGL_CARD=/dev/dri/card0 -e KASM_RENDERD=/dev/dri/renderD128 --device /dev/dri/renderD128:/dev/dri/renderD128:rw --device /dev/dri/card0:/dev/dri/card0:rw ....
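
Combined with the GPU flags discussed earlier in the thread, a full invocation would look roughly like this (image name, password, and device paths are examples for a single-GPU system):

docker run -it -p 6901:6901 \
  --gpus all \
  -e VNC_PW=password \
  -e KASM_EGL_CARD=/dev/dri/card0 \
  -e KASM_RENDERD=/dev/dri/renderD128 \
  --device /dev/dri/card0:/dev/dri/card0:rw \
  --device /dev/dri/renderD128:/dev/dri/renderD128:rw \
  kasmweb/ubuntu-focal-desktop:1.16.0-rolling-daily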

Additionally, you may run into permission issues. Normally, when running directly on the host, you just need to add the user to the video and render groups, so you will have to deal with that as well. The nuclear option is to chown 1000:1000 /dev/dri/card0 && chown 1000:1000 /dev/dri/renderD128 on the host. But you can first try adding uid 1000 to the video and render groups on the host. The container runs as uid 1000.
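
An alternative to changing ownership of the device nodes is to pass the host's video and render group IDs into the container with --group-add; this is only a sketch, and group names and GIDs vary between distributions:

# Look up the host GIDs and hand them to the container user
docker run \
  --group-add "$(getent group video | cut -d: -f3)" \
  --group-add "$(getent group render | cut -d: -f3)" \
  ... rest of the docker run command ...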

With the devices mapped into the container and the environment variables set, the vnc_startup.sh script will automatically run XFCE with VirtualGL (VGL), and most apps started within that XFCE session will also be run with VGL. There are occasionally some apps that require you to run them explicitly with vglrun to work correctly.
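
For the apps that do need it, wrapping them explicitly looks like this (the application names here are just examples):

# Verify the renderer seen through VirtualGL, then launch the app under it
vglrun glxinfo -B | grep "OpenGL renderer"
vglrun blender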

mmcclaskey avatar Mar 18 '25 17:03 mmcclaskey

@mmcclaskey I know that you use /dev/dri - it's in all the container README pages - but I do not have /dev/dri/card0 on the Kubernetes cluster. At least when using the NVIDIA GPU Operator, there is only /dev/nvidia0 and no /dev/dri inside the container or even on the node.
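
For comparison, a quick way to list which device nodes are actually present inside the pod or container:

ls -l /dev/dri /dev/nvidia* 2>/dev/null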

In my image I actually use linuxserver's kasm:ubuntujammy as the base, with a copy of the setup from the linuxserver KDE image (I need ubuntu:22.04, not 24.04 - 22.04 is what is recommended for UnrealEditor) and the packages from NVIDIA's cudagl image, but with the latest package versions (cudagl contained both the GL libraries and CUDA; it has been discontinued by NVIDIA since CUDA 11.8).

Seems to work fine for me at the moment (so far with KDE, glxgears, Blender, OpenSCAD, UE5Editor, UE4Editor and Minetest).

hochbit avatar Mar 18 '25 18:03 hochbit