traefik-helm-chart icon indicating copy to clipboard operation
traefik-helm-chart copied to clipboard

feat(deployment): allow null and 0 replicas

Open na2na-p opened this issue 5 months ago • 5 comments

This PR resolves #1451.

What does this PR do?

This pull request updates the logic for the deployment.replicas field to support two distinct, important use cases:

  1. Delegating Replica Management: Allows setting replicas: null to completely omit the replicas field from the rendered Deployment manifest.
  2. Scaling to Zero: Ensures that setting replicas: 0 correctly renders replicas: 0 in the manifest.

This is achieved by changing the rendering condition in deployment.yaml from replicas: {{ default 1 .Values.deployment.replicas }} to use an explicit non-nil check: {{- if and (not .Values.autoscaling.enabled) (not (eq .Values.deployment.replicas nil)) }}.

Motivation

The primary motivation is to solve the problem described in #1451, where users need to delegate replica management to external controllers like Argo Rollouts. This requires omitting the replicas field from the chart's output.

Following a discussion in the issue, we opted for a simpler approach of allowing replicas: null instead of introducing a new manageReplicas parameter. This PR implements that idea while also fixing a side effect where replicas: 0 would have been incorrectly ignored (as 0 is falsy in Go templates).

This solution provides the needed flexibility for modern GitOps patterns without adding new parameters and while maintaining backward compatibility.

More

  • [x] Yes, I updated the tests accordingly
  • [ ] Yes, I updated the schema accordingly
  • [x] Yes, I ran make test and all the tests passed

na2na-p avatar Jun 14 '25 05:06 na2na-p

I have tested the following scenarios and confirmed they work as intended:

A standard Deployment managed by an HPA.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: traefik-testing
  namespace: argocd
  labels:
    ManagedBy: CRD
spec:
  destination:
    namespace: traefik-testing
    server: https://kubernetes.default.svc
  source:
    repoURL: https://github.com/na2na-p/traefik-helm-chart.git
    targetRevision: feat/deployment-allow-null-replicas
    path: traefik
    helm:
      parameters:
        - name: deployment.replicas
          value: "null"
      values: |-
        autoscaling:
          enabled: true
          maxReplicas: 50
          minReplicas: 5
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true

image

Setting deployment.replicas: 0 while using Argo Rollouts.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: traefik-testing-with-rollouts
  namespace: argocd
  labels:
    ManagedBy: CRD
spec:
  destination:
    namespace: traefik-testing-with-rollouts
    server: https://kubernetes.default.svc
  source:
    repoURL: https://github.com/na2na-p/traefik-helm-chart.git
    targetRevision: feat/deployment-allow-null-replicas
    path: traefik
    helm:
      parameters:
        - name: deployment.replicas
          value: "0"
      values: |-
        extraObjects:
          - apiVersion: argoproj.io/v1alpha1
            kind: Rollout
            metadata:
              name: traefik-testing-with-rollouts
            spec:
              workloadRef:
                apiVersion: apps/v1
                kind: Deployment
                name: traefik-testing-with-rollouts
              strategy:
                canary:
                  steps:
                  - setWeight: 10
                  - pause:
                      duration: 5m
          - apiVersion: autoscaling/v2
            kind: HorizontalPodAutoscaler
            metadata:
              name: traefik-rollout-hpa
            spec:
              scaleTargetRef:
                apiVersion: argoproj.io/v1alpha1
                kind: Rollout
                name: traefik-testing-with-rollouts
              minReplicas: 5
              maxReplicas: 50
              metrics:
              - type: Resource
                resource:
                  name: cpu
                  target:
                    type: Utilization
                    averageUtilization: 80
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true

image

na2na-p avatar Jun 14 '25 06:06 na2na-p

Thanks. It's missing a test when replicas is nullified. Unless I missed something, the HPA can be configured with values, no need to use extraObjects.

Instead of hiding it in this PR, wdyt about adding an EXAMPLE on this use case ?

mloiseleur avatar Jun 16 '25 06:06 mloiseleur

@mloiseleur

Thanks to comments!


It's missing a test when replicas is nullified.

Thanks for pointing that out! I've added the test case.


Unless I missed something, the HPA can be configured with values, no need to use extraObjects.

Good point. Unless I'm mistaken, my understanding is that the apiVersion and kind are hardcoded here and cannot be changed through values: https://github.com/traefik/traefik-helm-chart/blob/8873fd68ec36561b8853fb357848f2423e597fc9/traefik/templates/hpa.yaml#L20-L21

Additionally, this Helm chart is designed for the proxy workload to run as either a Deployment or a DaemonSet, so I believe there's little need to modify them.


Instead of hiding it in this PR, wdyt about adding an EXAMPLE on this use case ?

That's a great suggestion, thank you! I'll work on adding an Install with Argo Rollouts example in this same PR.

I'll start on the modifications now, so please give me a little time.

na2na-p avatar Jun 16 '25 12:06 na2na-p

Unless I'm mistaken, my understanding is that the apiVersion and kind are hardcoded here and cannot be changed through values

\o that's right. I opened #1453 in order to add this feature. Feel free to review and test it.

mloiseleur avatar Jun 16 '25 12:06 mloiseleur

@mloiseleur

I'm planning to add conditional logic based on scaleTargetRef.apiVersion and scaleTargetRef.kind, so I think it would be best for me to rebase after #1453 is merged first.

It worked successfully after I applied this fix based on your code. This works with the configuration below.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: traefik-testing-with-rollouts
  namespace: argocd
spec:
  destination:
    namespace: traefik-testing-with-rollouts
    server: https://kubernetes.default.svc
  source:
    repoURL: https://github.com/na2na-p/traefik-helm-chart.git
    targetRevision: feat/deployment-allow-zero-replicas-with-configuable-hpa
    path: traefik
    helm:
      values: |-
        deployment:
          replicas: 0
        autoscaling:
          enabled: true
          minReplicas: 5
          maxReplicas: 50
          metrics:
            - type: Resource
              resource:
                name: cpu
                target:
                  type: Utilization
                  averageUtilization: 80
          scaleTargetRef:
            apiVersion: argoproj.io/v1alpha1
            kind: Rollout
        extraObjects:
          - |
            apiVersion: argoproj.io/v1alpha1
            kind: Rollout
            metadata:
              name: {{ template "traefik.fullname" . }}
            spec:
              workloadRef:
                apiVersion: apps/v1
                kind: Deployment
                name: {{ template "traefik.fullname" . }}
              strategy:
                canary:
                  steps:
                  - setWeight: 10
                  - pause:
                      duration: 5m
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true

image

na2na-p avatar Jun 16 '25 14:06 na2na-p

cc @jnoordsij @darkweaver87 for 2nd review

mloiseleur avatar Jun 24 '25 13:06 mloiseleur