Running a container (Ubuntu 20.04) with GPU acceleration, the container uses the OpenGL renderer llvmpipe (LLVM 15.0.7, 256 bits) instead of NVIDIA GeForce RTX 3060/PCIe/SSE2
I create the container with:
sudo docker run -it -p 6901:6901 -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all --gpus="all" --name nvidia-test -e VNC_PW=123456 kasm-lrae nvidia-smi
I want the container to use my NVIDIA GPU to render Gazebo, which requires OpenGL. However, when I check inside the container with glxinfo -B, the reported OpenGL renderer is llvmpipe, not the NVIDIA GPU.
It seems that the container is not using the GPU for OpenGL rendering. According to the KasmVNC GPU Acceleration documentation, the expected output should report the NVIDIA GPU as the renderer.
nvidia-smi gives the expected output both on the host system and inside the container, which suggests that the NVIDIA driver is installed correctly.
Other information: the host system is Ubuntu 20.04, and the image kasm-lrae is based on kasmweb/ubuntu-focal-desktop:1.16.0-rolling-daily. The content of daemon.json is as follows:
{
  "registry-mirrors": [ "https://reg-mirror.qiniu.com/" ],
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}
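A quick way to see whether the container was given any DRI device nodes at all (llvmpipe is the usual fallback renderer when none are usable) is something like the following check, run on the host against the container name used above:
# list DRI nodes inside the running container, if any were passed through
sudo docker exec nvidia-test ls -l /dev/dri 2>/dev/null \
  || echo "no /dev/dri nodes visible inside the container"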
Same issue here. Looking forward to your solution.
I do not have a DRI interface available, so I use this script to get into a console with render offloading:
#!/usr/bin/bash
env __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only VK_DRIVER_FILES=/usr/share/vulkan/icd.d/nvidia_icd.json LIBVA_DRIVER_NAME=nvidia bash
You can create the JSON file like this:
cat /usr/share/vulkan/icd.d/virtio_icd.x86_64.json | jq '.ICD.library_path="/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0"' > /usr/share/vulkan/icd.d/nvidia_icd.json
The VK_ variables and the JSON file are only required if you need Vulkan.
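As a quick sanity check inside that offloaded shell (assuming mesa-utils and vulkan-tools are installed, which the steps above do not state), the reported renderer should now be the NVIDIA GPU rather than llvmpipe:
# OpenGL: should name the NVIDIA GPU instead of llvmpipe
glxinfo -B | grep -i "opengl renderer"
# Vulkan: only relevant if the VK_* variables above were set (--summary needs newer vulkan-tools)
vulkaninfo --summary 2>/dev/null | grep -i "devicename"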
Works for me with OpenSCAD, Blender, and Minetest, except that with the last one I have some other nasty mouse issues (#310).
@Maipengfei @codernew007
I suggest combining this with VirtualGL to make things work (a minimal sketch follows below). Check https://github.com/selkies-project/docker-nvidia-egl-desktop/blob/91ba69533d707cf21933cf30097491315d7805cf/Dockerfile#L296 and https://github.com/selkies-project/docker-nvidia-egl-desktop/blob/91ba69533d707cf21933cf30097491315d7805cf/entrypoint.sh#L98.
You may directly use my container implementation, which has an option to use KasmVNC as a flag, if you wish.
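For reference, a minimal sketch of what "combining VirtualGL" means at runtime, assuming VirtualGL 3.x is installed in the container and the NVIDIA card is exposed as /dev/dri/card0 (the linked Dockerfile and entrypoint handle the full setup):
# run a GL application against the NVIDIA card through VirtualGL's EGL back end
vglrun -d /dev/dri/card0 glxinfo -B
vglrun -d /dev/dri/card0 glxgears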
@hochbit Does this work with actual X11 windows, or only with offscreen workloads?
Relevant: https://github.com/TigerVNC/tigervnc/issues/1773
@ehfd With X11 - using it for UE5Editor (combined with my "hack" to have the mouse pointer in game mode)
I have now changed my /kclient/startwm.sh, adding the nvidia-smi check:
# Enable NVIDIA GPU support if detected
if which nvidia-smi >/dev/null 2>&1; then
    export __NV_PRIME_RENDER_OFFLOAD=1
    export __GLX_VENDOR_LIBRARY_NAME=nvidia
    export __VK_LAYER_NV_optimus=NVIDIA_only
    export VK_DRIVER_FILES=/usr/share/vulkan/icd.d/nvidia_icd.json
    export LIBVA_DRIVER_NAME=nvidia
fi
...
Now all applications started from the desktop use the GPU. The container runs in a Pod in Kubernetes.
It also works locally for me with docker run ... --gpus all ...
When you run our containers within Kasm Workspaces, we take care of the docker run configuration auto-magically. If you are running our containers manually at the CLI, then you are responsible for ensuring everything is correct, and since everyone's system is different, it is really hard to provide guidance for manually run CLI commands that just works for everyone.
I don't run these containers manually with GPU acceleration very often, but looking at the code for Kasm Workspaces, I can see that we not only set nvidia as the runtime and some environment variables, but also pass through the DRI card and render device, which are typically located at /dev/dri/card0 and /dev/dri/renderD128 respectively. You may have multiple if your system has more than one video card. You need to pass those into the container, and in addition you need to set the environment variables KASM_EGL_CARD and KASM_RENDERD to the locations just mentioned. Kasm Workspaces was designed to support systems with multiple NVIDIA GPUs, mostly for enterprise-grade data center servers, so Kasm may decide to pass card0 into container A and card1 into container B.
All that being said, most people would need to add the following to their docker run command in order for the container to work as designed. Again, these containers are intended to be run inside Kasm Workspaces; we don't do anything that prohibits you from running them directly, to be clear.
docker run -it -e KASM_EGL_CARD=/dev/dri/card0 -e KASM_RENDERD=/dev/dri/renderD128 --device /dev/dri/renderD128:/dev/dri/renderD128:rw --device /dev/dri/card0:/dev/dri/card0:rw ....
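Putting this together with the GPU flags from the original question, a complete invocation might look like the following sketch (image name, port, and password are taken from the question; the device paths depend on your host):
sudo docker run -it \
  --gpus all \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=all \
  -e KASM_EGL_CARD=/dev/dri/card0 \
  -e KASM_RENDERD=/dev/dri/renderD128 \
  --device /dev/dri/card0:/dev/dri/card0:rw \
  --device /dev/dri/renderD128:/dev/dri/renderD128:rw \
  -p 6901:6901 \
  -e VNC_PW=123456 \
  --name nvidia-test \
  kasm-lrae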
Additionally, you may run into permission issues. Normally, when running directly on the host, you just need to add the user to the video and render groups, so you will have to deal with that as well. The nuclear option is to chown 1000:1000 /dev/dri/card0 && chown 1000:1000 /dev/dri/renderD128 on the host, but you can first try adding uid 1000 to the video and render groups on the host. The container runs as uid 1000.
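A sketch of the non-nuclear route on the host, assuming the device nodes are owned by the usual video and render groups and that a host user with uid 1000 exists (group names and GIDs vary by distribution):
# see which groups own the DRI nodes on the host
ls -l /dev/dri/card0 /dev/dri/renderD128
# add the user that maps to uid 1000 to those groups
sudo usermod -aG video,render "$(id -nu 1000)"
# nuclear option from above: hand the nodes to uid 1000 directly
sudo chown 1000:1000 /dev/dri/card0 /dev/dri/renderD128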
With the devices mapped into the container and the environment variables set, the vnc_startup.sh script will automatically run XFCE with VGL, and most apps started within that XFCE session will also be run with VGL. Occasionally some apps require you to run them explicitly with vglrun to work correctly, as in the example below.
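For such apps, the explicit form is simply vglrun followed by the application; glxgears and glxinfo are used here only as stand-in test programs:
vglrun glxgears
vglrun glxinfo -B   # should name the NVIDIA renderer when VGL is working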
@mmcclaskey I know that you use /dev/dri; it is in all the container README pages. But I do not have a /dev/dri/card0 on the Kubernetes cluster: at least when using the NVIDIA GPU Operator, there is only /dev/nvidia0 and no /dev/dri inside the container, or even on the node.
In my image I actually use linuxserver's kasm:ubuntujammy as the base, with a copy of the setup from the linuxserver kde image (I need ubuntu:22.04, not 24.04, since 22.04 is recommended for the Unreal Editor) and the packages from NVIDIA's cudagl image, but at their latest versions (cudagl contained both the GL libraries and CUDA; it has been discontinued by NVIDIA since CUDA 11.8).
Seems to work fine for me at the moment (so far it works with KDE, glxgears, Blender, OpenSCAD, the UE5 editor, the UE4 editor, and Minetest).