NVIDIA variants for bare metal
What I'd like: I would like to use Bottlerocket on premises, on machines with NVIDIA GPUs. Therefore, I'd like a variant that includes the NVIDIA drivers for bare metal.
Any alternatives you've considered: Support or instructions for building the nvidia-driver Docker container for Bottlerocket.
Hello @wokalski! Thanks for cutting this issue! Have you looked at how we build our NVIDIA AWS variants? https://github.com/bottlerocket-os/bottlerocket/blob/develop/variants/aws-k8s-1.28-nvidia/Cargo.toml
I haven't tried this since I don't have bare metal NVIDIA hardware to try it on, but building that custom variant may work without much more difficulty. The key piece is to add the packages for NVIDIA:
"nvidia-container-toolkit",
"nvidia-k8s-device-plugin",
"kmod-6.1-nvidia-tesla-535",
along with their dependencies. You may also need to adjust the size of the image https://github.com/bottlerocket-os/bottlerocket/blob/develop/variants/aws-k8s-1.28-nvidia/Cargo.toml#L13 to fit the drivers.
Can you see if this works for you?
I have seen it, but I didn't dig deeper, so I didn't know whether those packages were somehow AWS- or VM-specific.
I'll do my best to check it out and report back!
I took a bit of time to try this out and can confirm the images build just fine for metal:
$ git diff
diff --git a/variants/metal-k8s-1.28/Cargo.toml b/variants/metal-k8s-1.28/Cargo.toml
index d299e025..b7d50ced 100644
--- a/variants/metal-k8s-1.28/Cargo.toml
+++ b/variants/metal-k8s-1.28/Cargo.toml
@@ -36,6 +36,10 @@ included-packages = [
     "cni",
     "cni-plugins",
     "kubelet-1.28",
+    # nvidia
+    "nvidia-container-toolkit",
+    "nvidia-k8s-device-plugin",
+    "kmod-6.1-nvidia-tesla-535",
 ]
 
 [lib]
@@ -50,3 +54,7 @@ aws-iam-authenticator = { path = "../../packages/aws-iam-authenticator" }
 cni = { path = "../../packages/cni" }
 cni-plugins = { path = "../../packages/cni-plugins" }
 kubernetes-1_28 = { path = "../../packages/kubernetes-1.28" }
+# nvidia
+nvidia-container-toolkit = { path = "../../packages/nvidia-container-toolkit" }
+nvidia-k8s-device-plugin = { path = "../../packages/nvidia-k8s-device-plugin" }
+kmod-6_1-nvidia = { path = "../../packages/kmod-6.1-nvidia" }
I don't have hardware that can prove this works, but the resulting image did have all the drivers and the additional toolkits/plugins, and they did come up and attempt to find the hardware (which I don't have).