Nvidia container-runtime API for GPU allocation
Co-authored-by: Monirul Islam
Revives: https://github.com/bottlerocket-os/bottlerocket/pull/3994
Description of changes:
This PR exposes two new APIs that allow customers to configure the values of `accept-nvidia-visible-devices-as-volume-mounts` and `accept-nvidia-visible-devices-envvar-when-unprivileged` for the NVIDIA container runtime.

We introduced a default behavior of injecting NVIDIA GPUs using volume mounts (https://github.com/bottlerocket-os/bottlerocket/pull/3718). This PR lets users opt in to the previous behavior, in which unprivileged pods have access to all GPUs when `NVIDIA_VISIBLE_DEVICES=all` is set, and makes both behaviors configurable.
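For context, the new settings could also be applied at launch through Bottlerocket user data; the following is a minimal sketch in the usual user-data TOML form, assuming the opt-in (non-default) values described in the table below:

```toml
# Sketch: opting in to the previous GPU-injection behavior via user data.
# The setting names are the ones introduced in this PR; the values shown
# are the non-default, opt-in values.
[settings.kubernetes.nvidia.container-runtime]
visible-devices-as-volume-mounts = false
visible-devices-envvar-when-unprivileged = true
```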
| Bottlerocket setting | Impact | Value | What it means |
|---|---|---|---|
| `settings.kubernetes.nvidia.container-runtime.visible-devices-as-volume-mounts` | Sets the `accept-nvidia-visible-devices-as-volume-mounts` value for the k8s container toolkit | `true` / `false` (default: `true`) | Adjusting `visible-devices-as-volume-mounts` changes how GPUs are detected and integrated into container environments. Setting it to `true` makes the NVIDIA runtime recognize GPU devices listed in the `NVIDIA_VISIBLE_DEVICES` environment variable and mount them as volumes, which lets applications inside the container interact with the GPUs as if they were local resources. |
| `settings.kubernetes.nvidia.container-runtime.visible-devices-envvar-when-unprivileged` | Sets the `accept-nvidia-visible-devices-envvar-when-unprivileged` value of the NVIDIA container runtime for k8s variants | `true` / `false` (default: `false`) | When this setting is `false`, unprivileged containers are prevented from accessing all GPU devices on the host by default. If `NVIDIA_VISIBLE_DEVICES=all` is set in the container image and `visible-devices-envvar-when-unprivileged` is `true`, all GPUs on the host become accessible to the container, regardless of the limits set via `nvidia.com/gpu`. This can lead to more GPUs being allocated to a pod than intended, which affects resource scheduling and isolation. |
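These two settings map directly onto keys in the NVIDIA container runtime configuration. As a rough illustration, the relevant fragment of `/etc/nvidia-container-runtime/config.toml` with the default values looks like this (the full rendered file appears in the testing section below):

```toml
# Fragment of the rendered runtime config with the default setting values.
accept-nvidia-visible-devices-as-volume-mounts = true
accept-nvidia-visible-devices-envvar-when-unprivileged = false
```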
Testing done:
- [x] Functional Test
  - Built an AMI for the NVIDIA variant. Verified the settings get picked up with the default values:
```
$ apiclient get settings.kubernetes.nvidia.container-runtime
{
  "settings": {
    "kubernetes": {
      "nvidia": {
        "container-runtime": {
          "visible-devices-as-volume-mounts": true,
          "visible-devices-envvar-when-unprivileged": false
        }
      }
    }
  }
}
```
  - Opted in to the previous behavior to allow unprivileged NVIDIA device access:
```
$ apiclient set settings.kubernetes.nvidia.container-runtime.visible-devices-as-volume-mounts=false
$ apiclient set settings.kubernetes.nvidia.container-runtime.visible-devices-envvar-when-unprivileged=true
$ apiclient get settings.kubernetes.nvidia.container-runtime
{
  "settings": {
    "kubernetes": {
      "nvidia": {
        "container-runtime": {
          "visible-devices-as-volume-mounts": false,
          "visible-devices-envvar-when-unprivileged": true
        }
      }
    }
  }
}
```
  - Verified the `nvidia-container-runtime` config exists:
```
$ cat /etc/nvidia-container-runtime/config.toml
accept-nvidia-visible-devices-as-volume-mounts = true
accept-nvidia-visible-devices-envvar-when-unprivileged = false

[nvidia-container-cli]
root = "/"
path = "/usr/bin/nvidia-container-cli"
environment = []
ldconfig = "@/sbin/ldconfig"
```
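For comparison, with the opt-in values applied via the apiclient commands above, the same two keys would be expected to render with flipped values; this is a sketch based on the setting-to-key mapping, not captured output:

```toml
# Expected fragment after opting in (sketch, not captured from a node).
accept-nvidia-visible-devices-as-volume-mounts = false
accept-nvidia-visible-devices-envvar-when-unprivileged = true
```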
- [x] Migration Test: Tested migration from 1.20.1 to the new version, and migration back to 1.20.1.
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.