Specify pod resources (memory, CPU, GPU) when creating a workspace on K8s

Open dfdx opened this issue 1 year ago • 1 comments

Is your feature request related to a problem?

No

Which solution do you suggest?

Add CPU, GPU and memory settings to the requests and limits sections of the container's resources.

Also, in case of GPU:

  • add nodepool name to the node_selector
  • if nodepool has taints, add corresponding toleration to the pod spec

Here's the code we use internally in a similar tool to optionally add GPU node:

from typing import Optional


def maybe_add_gpu(pod_dict: dict, gpu: Optional[str]) -> None:
    """
    Optionally schedule the pod on a GPU node; modifies pod_dict in place.

    pod_dict - pod spec represented as Python dict
    gpu - GPU name, e.g. "a100"; the nodepool name is then specified as f"gpu{gpu.lower()}np", e.g. gpua100np
    """
    if not gpu:
        return
    spec = pod_dict["spec"]
    container = spec["containers"][0]
    # create the resources sections if the container doesn't have them yet
    resources = container.setdefault("resources", {})
    requests = resources.setdefault("requests", {})
    limits = resources.setdefault("limits", {})
    # resources
    requests["nvidia.com/gpu"] = 1
    limits["nvidia.com/gpu"] = 1
    # node_selector
    spec.setdefault("node_selector", {})["tlk-pool-name"] = f"gpu{gpu.lower()}np"
    # taint
    # taint_value = f"gpu{gpu.lower()}np"
    spec.setdefault("tolerations", []).append(
        {
            "key": "sku",
            "operator": "Equal",
            "value": "gpu",
            "effect": "NoSchedule",
        }
    )
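
CPU and memory could be handled in the same style. The following is only a rough sketch, not anything DevPod provides: the helper name add_cpu_memory and its parameters are made up for illustration, and it assumes the same dict layout as the GPU snippet, with values given as Kubernetes quantity strings such as "2" or "4Gi".

```python
from typing import Optional


def add_cpu_memory(
    pod_dict: dict, cpu: Optional[str] = None, memory: Optional[str] = None
) -> None:
    """
    Hypothetical helper: set CPU and memory requests and limits in place.

    cpu - Kubernetes quantity string, e.g. "2" or "500m"
    memory - Kubernetes quantity string, e.g. "4Gi"
    """
    container = pod_dict["spec"]["containers"][0]
    # create the resources sections if the container doesn't have them yet
    resources = container.setdefault("resources", {})
    requests = resources.setdefault("requests", {})
    limits = resources.setdefault("limits", {})
    # only touch the resources that were actually specified
    for name, value in (("cpu", cpu), ("memory", memory)):
        if value:
            requests[name] = value
            limits[name] = value


# usage on a minimal pod dict (image and container name are placeholders)
pod = {"spec": {"containers": [{"name": "devpod", "image": "ubuntu"}]}}
add_cpu_memory(pod, cpu="2", memory="4Gi")
print(pod["spec"]["containers"][0]["resources"])
```

Setting requests equal to limits is just one possible choice here; separate values for each would need extra parameters.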

Which alternative solutions exist?

Additional context

dfdx, Jun 21 '23 07:06