[FEATURE]: Separate configuration for garden-util pods

Open m-ackley opened this issue 2 years ago • 1 comments

Feature Request

Background / Motivation

Currently, garden-util and kaniko-build pods share the same tolerations, which means they cannot be scheduled to run on different nodes.

Additionally, there is no way to configure resource requests/limits for the garden-util pods, so they use whatever defaults the Kubernetes cluster applies.

See this forum thread for additional discussion.

What should the user be able to do?

The user should be able to specify separate tolerations and resource requests/limits for garden-util pods in the Kubernetes provider config. The resource requests/limits could also have reasonable defaults (some testing under load might be worthwhile to find the ideal amounts).

Why do they want to do this? What problem does it solve?

Some teams use a dynamically-scaling nodepool with their cloud provider, and may have additional resources allocated to their build nodes (making them cost more). It would be preferable to be able to limit the pods running on these nodes to the ephemeral build pods and run the garden-util pods elsewhere, and also to configure their resource requests/limits. This will allow users to more effectively manage the infrastructure footprint of their builds and minimize costs.

Suggested Implementation(s)

An additional section in the Kubernetes provider config under the kaniko settings for the garden-util pod resources and tolerations.

Example:

providers:
  resources:
    builder:
      limits:
        cpu:
        memory:
        ephemeralStorage:
      requests:
        cpu:
        memory:
        ephemeralStorage:
    util:
      limits:
        cpu:
        memory:
        ephemeralStorage:
      requests:
        cpu:
        memory:
        ephemeralStorage:
  kaniko:
    tolerations:
      builder:
        - effect:
          key:
          operator:
          value:
      util:
        - effect:
          key:
          operator:
          value:
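
For illustration, here is a filled-in sketch of the layout proposed above. Everything in it is an assumption: the `util` keys and the split `kaniko.tolerations` are the schema suggested in this issue (not an existing Garden option), the values are placeholders, the numeric units are assumed to be millicpu and megabytes, and the `dedicated` taint key is hypothetical. It also assumes the keys would sit under a `- name: kubernetes` provider entry.

```yaml
# Hypothetical values for the schema proposed in this issue.
# Not an existing Garden configuration option.
providers:
  - name: kubernetes
    resources:
      builder:
        limits:
          cpu: 4000              # assumed unit: millicpu
          memory: 8192           # assumed unit: megabytes
          ephemeralStorage: 20480
        requests:
          cpu: 1000
          memory: 2048
          ephemeralStorage: 10240
      util:                      # proposed: separate settings for garden-util pods
        limits:
          cpu: 500
          memory: 512
          ephemeralStorage: 1024
        requests:
          cpu: 100
          memory: 128
          ephemeralStorage: 256
    kaniko:
      tolerations:
        builder:                 # kaniko-build pods may land on tainted build nodes
          - key: dedicated       # hypothetical taint key
            operator: Equal
            value: build
            effect: NoSchedule
        util:                    # garden-util pods tolerate a different taint
          - key: dedicated
            operator: Equal
            value: util
            effect: NoSchedule
```

With a split like this, only the ephemeral build pods would tolerate the taint on the larger build nodes, while the garden-util pods could run on cheaper nodes with a smaller footprint.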

How important is this feature for you/your team?

🌵 Not having this feature makes using Garden painful

m-ackley avatar Sep 07 '22 22:09 m-ackley

Thanks @m-ackley! This makes sense, and thanks for the detailed context.

We might go with a slightly different config schema (not sure yet) but in principle this is easy enough to implement.

edvald avatar Sep 07 '22 22:09 edvald

I second this feature! That would be really nice.

antoinelyset avatar Oct 12 '22 15:10 antoinelyset

Wanted to chime in with a couple of extra points here.

Beyond tolerations, it would be good to be able to set annotations to ensure autoscaling works properly.

That would e.g. allow users to set "cluster-autoscaler.kubernetes.io/safe-to-evict": "true".

The same also applies to the cluster-buildkit Pod. It uses an emptyDir volume, so by default the cluster autoscaler will not evict it (see this GitHub issue for context).
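
As a rough sketch of what that annotation looks like on a Pod that mounts an emptyDir volume (this is an illustrative manifest, not the one Garden actually generates; the Pod name, image tag, and mount path are placeholders):

```yaml
# Illustrative Pod manifest only; names, image tag, and paths are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-buildkit
  annotations:
    # Tells the cluster autoscaler that this Pod may be evicted
    # even though it uses local (emptyDir) storage.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
spec:
  containers:
    - name: buildkitd
      image: moby/buildkit:latest
      volumeMounts:
        - name: cache
          mountPath: /var/lib/buildkit
  volumes:
    - name: cache
      emptyDir: {}
```

Without the annotation, a Pod like this can keep an otherwise idle node from being scaled down.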

This was also flagged by our Discord community.

(cc @Orzelius, @stefreak)

eysi09 avatar Nov 15 '22 14:11 eysi09

@eysi09 Yes, this will be implemented 👍 This issue is being tracked in https://github.com/garden-io/garden/issues/2628 and https://github.com/garden-io/garden/issues/2931

stefreak avatar Nov 15 '22 15:11 stefreak