
Webhooks are not triggered when installing on a local Kubernetes cluster

Open saninsignify opened this issue 1 year ago • 0 comments

Describe the bug

On a local Kubernetes cluster running in Docker Desktop, I installed Flagger following the official instructions (https://docs.flagger.app/install/flagger-install-on-kubernetes) and Linkerd following its getting-started guide (https://linkerd.io/2.16/getting-started/). When I set up a test canary deployment with podinfo, the webhooks to flagger-loadtester are never called, and Flagger always removes the canary and promotes the new version to primary.

To Reproduce

  1. Set up a local Kubernetes cluster using Docker Desktop (I did this on a Mac).
  2. Install Linkerd (https://linkerd.io/2.16/getting-started/) and the Linkerd Viz dashboard.
  3. Install Flagger (https://docs.flagger.app/install/flagger-install-on-kubernetes) and the flagger-loadtester.
  4. Deploy podinfo image 6.6.1 via Kustomize or a deployment.yaml.
  5. Apply the Canary resource defined below.
  6. Change the podinfo deployment to image 6.7.0. The canary status is stuck in "Initializing", the webhook for flagger-loadtester is never called, and the deployment is promoted automatically.

Canary Deployment file

    apiVersion: flagger.app/v1beta1
    kind: Canary
    metadata:
      name: podinfo
      namespace: test
    spec:
      # deployment reference
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: podinfo
      progressDeadlineSeconds: 60
      service:
        # ClusterIP port number
        port: 80
        # container port number or name (optional)
        targetPort: 9898
      skipAnalysis: false
      analysis:
        # schedule interval (default 60s)
        interval: 10s
        # max number of failed metric checks before rollback
        threshold: 3
        # A/B test interactions
        # iterations: 1
        maxWeight: 15
        stepWeight: 5
        stepWeightPromotion: 50
        # Linkerd Prometheus checks
        metrics:
        - name: request-success-rate
          thresholdRange:
            min: 50
          interval: 1m
        - name: request-duration
          thresholdRange:
            max: 500
          interval: 30s
        webhooks:
        - name: "confirmation gate"
          type: confirm-promotion
          url: http://flagger-loadtester.test/gate/halt
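
For readers following along, here is a rough sketch of what the analysis settings above should produce if the analysis actually runs. It is a simplified model, not Flagger's code: the function names are made up for illustration, the two timing formulas are the ones the Flagger docs give for canary releases (interval * (maxWeight / stepWeight) and interval * threshold), and it ignores the confirm-promotion gate and the stepWeightPromotion phase.

```python
def canary_weights(step_weight, max_weight):
    """Traffic percentages routed to the canary at each analysis step."""
    return list(range(step_weight, max_weight + 1, step_weight))


def min_promotion_seconds(interval_s, step_weight, max_weight):
    """Lower bound on analysis time: interval * (maxWeight / stepWeight)."""
    return interval_s * (max_weight // step_weight)


def rollback_seconds(interval_s, threshold):
    """Time until rollback if every check fails: interval * threshold."""
    return interval_s * threshold


# Values taken from the Canary spec in this issue:
# interval: 10s, threshold: 3, stepWeight: 5, maxWeight: 15
print(canary_weights(step_weight=5, max_weight=15))
print(min_promotion_seconds(interval_s=10, step_weight=5, max_weight=15))
print(rollback_seconds(interval_s=10, threshold=3))
```

Under these assumptions the canary would step through 5%, 10%, and 15% of traffic over at least 30 seconds before the promotion gate is consulted, so an immediate promotion suggests the analysis was skipped entirely.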

Expected behavior

Version 6.7.0 should remain in the canary phase and not be promoted, because the "confirm-promotion" gate receives a 403 from http://flagger-loadtester.test/gate/halt
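
To check the gate independently of Flagger, something like the sketch below could be run from a pod inside the cluster. It assumes the gate contract is "any 2xx response allows promotion, anything else halts it", and the payload fields (name, namespace, phase) are only modeled on the webhook payload described in the Flagger docs. The helper names are hypothetical, and the hostname flagger-loadtester.test only resolves inside the cluster.

```python
import json
import urllib.error
import urllib.request


def status_allows_promotion(status_code):
    """Assumed gate contract: only a 2xx status lets the promotion proceed."""
    return 200 <= status_code < 300


def probe_gate(url, name="podinfo", namespace="test", timeout=5.0):
    """POST a Flagger-style payload to the gate URL and return the HTTP status."""
    # Illustrative payload; the phase value is a placeholder assumption.
    payload = json.dumps(
        {"name": name, "namespace": namespace, "phase": "Progressing"}
    ).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses, such as the 403 from /gate/halt.
        return err.code
    # Connection failures (urllib.error.URLError) are deliberately not caught:
    # an unreachable gate is a different problem from a closed one.
```

Calling `status_allows_promotion(probe_gate("http://flagger-loadtester.test/gate/halt"))` from inside the cluster should return `False` if the load tester answers with the 403 described above.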

Additional context

  • Flagger version: 1.38
  • Kubernetes version: 1.29.5
  • Service Mesh provider: Linkerd edge-24.8.2
  • Ingress provider: none

saninsignify · Aug 29 '24 14:08