
How to use GPU with k3d

Open arikmaor opened this issue 2 years ago • 20 comments

I've just managed to get k3d running with GPU support, and it took a lot of effort to get this working. The documentation has not been updated for a long time, and most of the information is scattered across many PRs, issues, and Medium articles.

I'm going to describe what I did, and I hope you can update the docs.

What should be installed on the host?

Based on the nvidia installation guide

Nvidia Drivers

sudo apt update
sudo apt install nvidia-driver-450 # I guess other driver versions will work as well

nvidia-runtime package

There is a lot of confusion around this part, as there are many similarly named packages. You want nvidia-docker2.

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
         
sudo apt install -y nvidia-docker2

sudo systemctl restart docker # restart docker to pick up the new runtime

Check that the host is configured correctly

sudo docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

If you see the nvidia-smi output with your graphics card's name, then the host is configured correctly!
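If that check fails, it's worth verifying that nvidia-docker2 registered the runtime with Docker. A quick way to inspect it (the expected contents below are approximate):

cat /etc/docker/daemon.json
# should contain something like:
# {
#     "runtimes": {
#         "nvidia": {
#             "path": "nvidia-container-runtime",
#             "runtimeArgs": []
#         }
#     }
# }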

The custom k3s image

The custom image in the current k3d documentation requires the following tweaks:

  1. Instead of COPY --from=k3s / /, do COPY --from=k3s /bin /bin
  2. Also copy the etc dir: COPY --from=k3s /etc /etc (I'm not sure this is strictly required)
  3. The CRI environment variable is missing: ENV CRI_CONFIG_FILE=/var/lib/rancher/k3s/agent/etc/crictl.yaml
  4. The widely suggested config.toml.tmpl file causes a cgroups-related error for all the pods. I managed to solve this by creating a new file based on the original template, with the simple addition of default_runtime_name = "nvidia" under [plugins.cri.containerd]

Dockerfile:

ARG K3S_TAG="v1.26.4-k3s1"
FROM rancher/k3s:$K3S_TAG as k3s

FROM nvidia/cuda:11.8.0-base-ubuntu22.04

ARG NVIDIA_CONTAINER_RUNTIME_VERSION
ENV NVIDIA_CONTAINER_RUNTIME_VERSION=$NVIDIA_CONTAINER_RUNTIME_VERSION

RUN apt-get update && \
    apt-get -y install gnupg2 curl nvidia-container-runtime=${NVIDIA_CONTAINER_RUNTIME_VERSION} && \
    chmod 1777 /tmp && \
    mkdir -p /var/lib/rancher/k3s/agent/etc/containerd && \
    mkdir -p /var/lib/rancher/k3s/server/manifests

COPY --from=k3s /bin /bin
COPY --from=k3s /etc /etc

# Provide custom containerd configuration to configure the nvidia-container-runtime
COPY config.toml.tmpl /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl

# Deploy the nvidia driver plugin on startup
COPY device-plugin-daemonset.yaml /var/lib/rancher/k3s/server/manifests/nvidia-device-plugin-daemonset.yaml

VOLUME /var/lib/kubelet
VOLUME /var/lib/rancher/k3s
VOLUME /var/lib/cni
VOLUME /var/log

ENV PATH="$PATH:/bin/aux"
ENV CRI_CONFIG_FILE=/var/lib/rancher/k3s/agent/etc/crictl.yaml

ENTRYPOINT ["/bin/k3s"]
CMD ["agent"]

config.toml.tmpl

version = 2

[plugins."io.containerd.internal.v1.opt"]
  path = "{{ .NodeConfig.Containerd.Opt }}"
[plugins."io.containerd.grpc.v1.cri"]
  stream_server_address = "127.0.0.1"
  stream_server_port = "10010"
  enable_selinux = {{ .NodeConfig.SELinux }}
  enable_unprivileged_ports = {{ .EnableUnprivileged }}
  enable_unprivileged_icmp = {{ .EnableUnprivileged }}

{{- if .DisableCgroup}}
  disable_cgroup = true
{{end}}
{{- if .IsRunningInUserNS }}
  disable_apparmor = true
  restrict_oom_score_adj = true
{{end}}

{{- if .NodeConfig.AgentConfig.PauseImage }}
  sandbox_image = "{{ .NodeConfig.AgentConfig.PauseImage }}"
{{end}}

{{- if .NodeConfig.AgentConfig.Snapshotter }}
[plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "nvidia"
  snapshotter = "{{ .NodeConfig.AgentConfig.Snapshotter }}"
  disable_snapshot_annotations = {{ if eq .NodeConfig.AgentConfig.Snapshotter "stargz" }}false{{else}}true{{end}}
{{ if eq .NodeConfig.AgentConfig.Snapshotter "stargz" }}
{{ if .NodeConfig.AgentConfig.ImageServiceSocket }}
[plugins."io.containerd.snapshotter.v1.stargz"]
cri_keychain_image_service_path = "{{ .NodeConfig.AgentConfig.ImageServiceSocket }}"
[plugins."io.containerd.snapshotter.v1.stargz".cri_keychain]
enable_keychain = true
{{end}}
{{ if .PrivateRegistryConfig }}
{{ if .PrivateRegistryConfig.Mirrors }}
[plugins."io.containerd.snapshotter.v1.stargz".registry.mirrors]{{end}}
{{range $k, $v := .PrivateRegistryConfig.Mirrors }}
[plugins."io.containerd.snapshotter.v1.stargz".registry.mirrors."{{$k}}"]
  endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}]
{{if $v.Rewrites}}
  [plugins."io.containerd.snapshotter.v1.stargz".registry.mirrors."{{$k}}".rewrite]
{{range $pattern, $replace := $v.Rewrites}}
    "{{$pattern}}" = "{{$replace}}"
{{end}}
{{end}}
{{end}}
{{range $k, $v := .PrivateRegistryConfig.Configs }}
{{ if $v.Auth }}
[plugins."io.containerd.snapshotter.v1.stargz".registry.configs."{{$k}}".auth]
  {{ if $v.Auth.Username }}username = {{ printf "%q" $v.Auth.Username }}{{end}}
  {{ if $v.Auth.Password }}password = {{ printf "%q" $v.Auth.Password }}{{end}}
  {{ if $v.Auth.Auth }}auth = {{ printf "%q" $v.Auth.Auth }}{{end}}
  {{ if $v.Auth.IdentityToken }}identitytoken = {{ printf "%q" $v.Auth.IdentityToken }}{{end}}
{{end}}
{{ if $v.TLS }}
[plugins."io.containerd.snapshotter.v1.stargz".registry.configs."{{$k}}".tls]
  {{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}}
  {{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}}
  {{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}}
  {{ if $v.TLS.InsecureSkipVerify }}insecure_skip_verify = true{{end}}
{{end}}
{{end}}
{{end}}
{{end}}
{{end}}

{{- if not .NodeConfig.NoFlannel }}
[plugins."io.containerd.grpc.v1.cri".cni]
  bin_dir = "{{ .NodeConfig.AgentConfig.CNIBinDir }}"
  conf_dir = "{{ .NodeConfig.AgentConfig.CNIConfDir }}"
{{end}}

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = {{ .SystemdCgroup }}

{{ if .PrivateRegistryConfig }}
{{ if .PrivateRegistryConfig.Mirrors }}
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]{{end}}
{{range $k, $v := .PrivateRegistryConfig.Mirrors }}
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."{{$k}}"]
  endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}]
{{if $v.Rewrites}}
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."{{$k}}".rewrite]
{{range $pattern, $replace := $v.Rewrites}}
    "{{$pattern}}" = "{{$replace}}"
{{end}}
{{end}}
{{end}}

{{range $k, $v := .PrivateRegistryConfig.Configs }}
{{ if $v.Auth }}
[plugins."io.containerd.grpc.v1.cri".registry.configs."{{$k}}".auth]
  {{ if $v.Auth.Username }}username = {{ printf "%q" $v.Auth.Username }}{{end}}
  {{ if $v.Auth.Password }}password = {{ printf "%q" $v.Auth.Password }}{{end}}
  {{ if $v.Auth.Auth }}auth = {{ printf "%q" $v.Auth.Auth }}{{end}}
  {{ if $v.Auth.IdentityToken }}identitytoken = {{ printf "%q" $v.Auth.IdentityToken }}{{end}}
{{end}}
{{ if $v.TLS }}
[plugins."io.containerd.grpc.v1.cri".registry.configs."{{$k}}".tls]
  {{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}}
  {{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}}
  {{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}}
  {{ if $v.TLS.InsecureSkipVerify }}insecure_skip_verify = true{{end}}
{{end}}
{{end}}
{{end}}

{{range $k, $v := .ExtraRuntimes}}
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."{{$k}}"]
  runtime_type = "{{$v.RuntimeType}}"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."{{$k}}".options]
  BinaryName = "{{$v.BinaryName}}"
{{end}}
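
With the Dockerfile, config.toml.tmpl, and device-plugin-daemonset.yaml in place, building the image and creating a cluster looks roughly like this (the image tag and runtime version are example values, pick ones matching your setup; k3d's --gpus flag is passed through to Docker):

docker build \
  --build-arg K3S_TAG=v1.26.4-k3s1 \
  --build-arg NVIDIA_CONTAINER_RUNTIME_VERSION=3.13.0-1 \
  -t k3s-cuda:v1.26.4-k3s1 .

k3d cluster create gputest --image k3s-cuda:v1.26.4-k3s1 --gpus all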

arikmaor avatar Jul 23 '22 00:07 arikmaor

First of all, thanks for sharing this. It worked for me; I also tested it with the latest version of k3s: v1.24.4-k3s1

I didn't explore the differences in detail, but the device-plugin-daemonset.yaml manifest in the k3d docs seems outdated compared to the one on NVIDIA's GitHub:

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.12.2/nvidia-device-plugin.yml

nuxion avatar Sep 04 '22 22:09 nuxion

Thanks @arikmaor for posting this. The original config.toml.tmpl from the docs was not working for me, and all pods were failing with:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to start shim: exec: "containerd-shim": executable file not found in $PATH: unknown

Your config fixed it! Can I ask what the source of your config.toml.tmpl is?

david-suba avatar Nov 04 '22 09:11 david-suba

I have been having issues with the nvidia-device-plugin crashing. Any tips for seeing why it crashed? I am having a hard time finding the output. This is also happening on all the machines I have tested (2 so far).

Z02X avatar Nov 05 '22 15:11 Z02X

Thanks @arikmaor for posting this. The original config.toml.tmpl from the docs was not working for me, and all pods were failing with:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to start shim: exec: "containerd-shim": executable file not found in $PATH: unknown

Your config fixed it! Can I ask what the source of your config.toml.tmpl is?

The config.toml.tmpl file is based on the original template and the only change is adding:

[plugins.cri.containerd]
  default_runtime_name = "nvidia"

arikmaor avatar Nov 06 '22 11:11 arikmaor

I have been having issues with the nvidia-device-plugin crashing. Any tips for seeing why it crashed? I am having a hard time finding the output. This is also happening on all the machines I have tested (2 so far).

  1. Use kubectl logs -n kube-system ds/nvidia-device-plugin-daemonset to see the logs
  2. Run docker exec -it k3d-YOUR_CLUSTER_NAME-server-0 nvidia-smi and check that you see your graphics card in the output. If not, the problem is probably in the host or Docker configuration, not in nvidia-device-plugin.

arikmaor avatar Nov 06 '22 11:11 arikmaor

Thanks @arikmaor for posting this. The original config.toml.tmpl from the docs was not working for me, and all pods were failing with:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to start shim: exec: "containerd-shim": executable file not found in $PATH: unknown

Your config fixed it! Can I ask what the source of your config.toml.tmpl is?

The config.toml.tmpl file is based on the original template and the only change is adding:

[plugins.cri.containerd]
  default_runtime_name = "nvidia"

Adding the line seems to cause the following:

Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService

I tried to search for some related topics. Has anyone else encountered something similar?


Okay, I found it: https://github.com/k3d-io/k3d/issues/658#issuecomment-1130551511

Make sure you have all versions aligned. :)
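
A quick way to check which versions are in play (replace the cluster name with yours):

k3d version
docker exec k3d-YOUR_CLUSTER_NAME-server-0 k3s --version
dpkg -l | grep nvidia-container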

NeverBehave avatar Jan 27 '23 00:01 NeverBehave

Thanks arikmaor, your comment was super helpful. Confirmed to work with the latest version of K3s: v1.26.1-k3s1. Created a small repo for my own bootstrapping purposes, feel free to use if it's helpful: https://github.com/lajd/k3d_bootstrap

lajd avatar Jan 27 '23 23:01 lajd

Thank you @arikmaor! I succeeded only after applying your modifications. My setup:

$ k3d --version
k3d version v5.4.7
k3s version v1.25.6-k3s1 (default)

$ docker -v
Docker version 20.10.12, build 20.10.12-0ubuntu4

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.2 LTS
Release:	22.04
Codename:	jammy

Dockerfile (only the modified lines):
ARG K3S_TAG="v1.25.6-k3s1"
FROM nvidia/cuda:12.0.1-base-ubuntu22.04

infroger avatar Feb 28 '23 18:02 infroger

@all-contributors please add @arikmaor for tutorial and example

arikmaor avatar Mar 08 '23 23:03 arikmaor

@arikmaor

I've updated the pull request to add @arikmaor! :tada:

allcontributors[bot] avatar Mar 08 '23 23:03 allcontributors[bot]

@arikmaor I notice that you've put the default_runtime_name under the if .NodeConfig.AgentConfig.Snapshotter conditional; is this intentional?

rassie avatar Mar 21 '23 15:03 rassie

@arikmaor I notice that you've put the default_runtime_name under the if .NodeConfig.AgentConfig.Snapshotter conditional; is this intentional?

Take a look at the original template; the if is already there.

...
{{- if .NodeConfig.AgentConfig.Snapshotter }}
[plugins.cri.containerd]
  snapshotter = "{{ .NodeConfig.AgentConfig.Snapshotter }}"
...

I simply added default_runtime_name = "nvidia" under plugins.cri.containerd, as this seemed to be the place for this setting; I did not go over the template logic.

arikmaor avatar Mar 22 '23 00:03 arikmaor

It is worth noting that the original template file has changed, and I'm not sure of the implications. It could mean that newer versions of k3s will require a slightly different config.toml.tmpl file, based on the latest version.

I believe we now need to set default_runtime_name = "nvidia" under [plugins."io.containerd.grpc.v1.cri".containerd] instead, but I haven't had time to test it (see the snippet below).
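
That is, the addition would presumably move to (untested, but it matches the template posted above):

[plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "nvidia"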

arikmaor avatar Mar 22 '23 00:03 arikmaor

Yeah, this situation is honestly a bit maddening: overwriting this whole configuration file just for a couple of lines, while some amount of auto-configuration is already happening, feels like overkill.

However, I've become interested in k3s's automatic NVIDIA runtime support. The current config.toml automatically adds configuration for .ExtraRuntimes, which nvidia falls into. While setting default_runtime_name requires overriding the file, we could follow the ~~easier~~ more explicit way from the k3s docs: add a RuntimeClass and set runtimeClassName: nvidia on the nvidia-device-plugin DaemonSet and on all the pods that need the GPU.

I've tested this setup with the usual vector-add pod, and it works for me. I basically added two mounts to my cluster, but including these files directly in the image would work as well, with no extra configuration required (the manifest contents are sketched after the mounts):

  - volume: ${PWD}/nvidia-device-plugin.yaml:/var/lib/rancher/k3s/server/manifests/nvidia-device-plugin.yaml
    nodeFilters:
      - all
  - volume: ${PWD}/nvidia-runtime-class.yaml:/var/lib/rancher/k3s/server/manifests/nvidia-runtime-class.yaml
    nodeFilters:
      - all
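
For reference, the RuntimeClass manifest from the k3s docs is tiny; nvidia-device-plugin.yaml is NVIDIA's upstream manifest with runtimeClassName: nvidia added to the DaemonSet's pod spec:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia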

Should we try to consolidate all of this information into a PR?

rassie avatar Mar 22 '23 08:03 rassie

Yeah, this situation is honestly a bit maddening, since overwriting this whole configuration file just for a couple of lines while there is some amount of auto-configuration already happening feels a bit overkill.

I totally agree

My approach to getting GPU workloads working was based on the guide in the k3d documentation. When it didn't just work, I found a way to make it work with just a few tweaks and described it in this issue. What you suggest has a few more implications, so I think @iwilltry42 should be involved in the discussion.

I like your idea: right now, a k3s update can require changes to the custom template, which makes upgrading hard. IIUC, your solution solves this problem, and it's awesome.

The problem I see with this solution is that setting runtimeClassName: nvidia is not trivial, IMO; it's not something you usually do in order to use the GPU. The runtimeClassName property is new to me. In all the cloud environments, AFAIK, the flow to use a GPU is:

  1. Create a node that has a GPU
  2. Install the nvidia-device-plugin
  3. Request nvidia.com/gpu: 1 on the workloads that need to use the GPU (sketched below)
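
For reference, step 3 usually looks like this in a pod spec (a generic Kubernetes example, not k3d-specific):

apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:11.8.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1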

arikmaor avatar Mar 22 '23 11:03 arikmaor

I'm still unable to make any of this work. It would be so much simpler if someone were kind enough to push a custom-built k3d Docker image that supports GPUs :(

EKami avatar Jun 02 '23 01:06 EKami

On a somewhat related note: the docs mention that this whole GPU configuration does not work on WSL2. Currently that is still somewhat true, but a solution is on the horizon: there is an open merge request on the nvidia-device-plugin repository, which in my local testing does indeed solve the problem. All that's needed is to build the current 1.4 version of nvidia-device-plugin with this patch applied on top (which is very simple, since their whole build process is dockerized), push the image somewhere, and update the image reference in the Kubernetes manifest for nvidia-device-plugin.

rassie avatar Jul 01 '23 19:07 rassie

I've updated the original comment (and tested it):

  1. The base image was set to nvidia/cuda:11.8.0-base-ubuntu22.04 (before, it was Ubuntu 18)
  2. The k3s base image was upgraded to v1.26.4-k3s1 (before, it was v1.23.8)
  3. The containerd config file template was upgraded to the new version (the default embedded template was updated upstream)

arikmaor avatar Jul 12 '23 09:07 arikmaor

FYI: the nvidia-container-runtime project has been superseded by the NVIDIA Container Toolkit.

The NVIDIA_CONTAINER_RUNTIME_VERSION build argument has therefore become obsolete.

2qif49lt avatar Dec 04 '23 03:12 2qif49lt

Hey all, I made an attempt to replicate what @arikmaor had done, and I have it working on our Lambda cloud instance. I also took the opportunity to use more up-to-date nvidia/cuda, NVIDIA device plugin, and k3s dependencies, and created a better build script. The Dockerfile has some notable changes that improve the approach, and fix its bugs, a little. The image is located below and should work for linux/amd64 and linux/arm64. Just pull the image down and bootstrap your own k3d cluster with it (a usage sketch follows at the end of this comment). You can also feel free to deploy my super-simplistic gpu-support-test, but that will require an open-source K8s tool called Zarf. My next attempt will be to put Whisper and an API in front of it within the cluster, to see if it truly is all working E2E beyond my gpu-support-test.

k3d GPU Support image: https://github.com/justinthelaw/k3d-gpu-support/pkgs/container/k3d-gpu-support
Repo with the source: https://github.com/justinthelaw/k3d-gpu-support

Please let me know if there are any fixes or issues. Would love some feedback and also support for other future use cases (like ROCm, Metal, etc.) involving k3d. I am still new to all of this.

Important note: the CUDA version must match your host system's or containerized NVIDIA drivers. E.g., our Lambda instance has driver 535 installed, with a max CUDA of 12.2.x, so this image's base image is set to 12.2.0 for compatibility.
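
A minimal usage sketch (the tag is hypothetical; check the package page for the tags actually published):

docker pull ghcr.io/justinthelaw/k3d-gpu-support:<tag>
k3d cluster create mycluster --image ghcr.io/justinthelaw/k3d-gpu-support:<tag> --gpus all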

justinthelaw avatar Jan 11 '24 14:01 justinthelaw

Took a stab at gathering all the info in this thread and submitted a PR with updated CUDA documentation. I took the suggestion from the k3s docs (also mentioned here by @rassie) and added a RuntimeClass definition and runtimeClassName: nvidia to the Pod specs. This should be more robust than the old method, since you no longer need a custom config.toml.

I did not run into any issues on my setup, but I only did some basic testing, so feedback is appreciated. It appears to work fine in WSL2 as well. Updated files available in this repo: https://github.com/dbreyfogle/k3d-nvidia-runtime

dbreyfogle avatar Apr 15 '24 04:04 dbreyfogle

Docs updated in https://github.com/k3d-io/k3d/pull/1430 - thanks @dbreyfogle and everyone in this thread for providing the information! Please test the new docs and give feedback, if possible :+1:

iwilltry42 avatar Apr 15 '24 05:04 iwilltry42