post-rollout webhook not executed

Open gedankennebel opened this issue 2 years ago • 4 comments

Describe the bug

We have enabled the event and post-rollout webhooks, and canary analysis is currently skipped (skipAnalysis: true). However, the post-rollout webhook is never called. Through the event webhook we receive the message "Promotion completed! Canary analysis was skipped for my-test-app", but no post-rollout webhook call follows.

To Reproduce

Our Canary YAML:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  annotations:
    meta.helm.sh/release-name: template-service
    meta.helm.sh/release-namespace: acp-tui
  creationTimestamp: "2022-05-11T14:48:50Z"
  generation: 1
  labels:
    app.kubernetes.io/chart: microservice
    app.kubernetes.io/instance: template-service
    app.kubernetes.io/managed-by: Helm
    app_name: template-service
    helm.sh/chart: microservice-0.19.0
    helm.toolkit.fluxcd.io/name: template-service
    helm.toolkit.fluxcd.io/namespace: acp-tui
  name: template-service
  namespace: acp-tui
  resourceVersion: "217131234"
  uid: 15b4fd7e-4491-40b0-88b5-1ba4b88b31c2
spec:
  analysis:
    interval: 30s
    iterations: 1
    metrics:
    - interval: 1m
      name: request-success-rate
      threshold: 99
    - interval: 1m
      name: request-duration
      threshold: 500
    threshold: 10
    webhooks:
    - name: Post Rollout Events
      timeout: 5s
      type: post-rollout
      url: https://webhook.site/XXXXXX
    - name: Flagger events
      type: event
      url: https://webhook.site/XXXXXX
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: template-service
  service:
    port: 8080
  skipAnalysis: true
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: template-service
status:
  canaryWeight: 0
  conditions:
  - lastTransitionTime: "2022-05-11T15:16:33Z"
    lastUpdateTime: "2022-05-11T15:16:33Z"
    message: Canary analysis completed successfully, promotion finished.
    reason: Succeeded
    status: "True"
    type: Promoted
  failedChecks: 0
  iterations: 0
  lastAppliedSpec: 864769877f
  lastTransitionTime: "2022-05-11T15:16:33Z"
  phase: Succeeded
  trackedConfigs:
    secret/config-server-secret: 110e449ecf8c3025
    secret/tui-logging-index: 9ce158b29ac19da7

This is the output of a describe on the Canary resource:

Status:
  Canary Weight:  0
  Conditions:
    Last Transition Time:  2022-05-11T15:16:33Z
    Last Update Time:      2022-05-11T15:16:33Z
    Message:               Canary analysis completed successfully, promotion finished.
    Reason:                Succeeded
    Status:                True
    Type:                  Promoted
  Failed Checks:           0
  Iterations:              0
  Last Applied Spec:       864769877f
  Last Transition Time:    2022-05-11T15:16:33Z
  Phase:                   Succeeded
  Tracked Configs:
    secret/config-server-secret:  110e449ecf8c3025
    secret/tui-logging-index:     9ce158b29ac19da7
Events:
  Type    Reason  Age   From     Message
  ----    ------  ----  ----     -------
  Normal  Synced  47m   flagger  Initialization done! template-service.acp-tui
  Normal  Synced  20m   flagger  New revision detected! Scaling up template-service.acp-tui
  Normal  Synced  19m   flagger  Copying template-service.acp-tui template spec to template-service-primary.acp-tui
  Normal  Synced  19m   flagger  Promotion completed! Canary analysis was skipped for template-service.acp-tui

Expected behavior

According to the webhooks documentation (https://docs.flagger.app/usage/webhooks):

post-rollout hooks are executed after the canary has been promoted or rolled back. If a post rollout hook fails the error is logged.

We expect the post-rollout webhook to be called after the canary has been promoted.

Additional context

  • Flagger version: 1.19.0
  • Kubernetes version: v1.22.6-eks-7d68063
  • Service Mesh provider: linkerd
  • Ingress provider: AWS Application Load Balancer (ALB)

gedankennebel (May 11 '22 16:05)

Does it work with skipAnalysis: false?

stefanprodan (May 11 '22 16:05)

Hi Stefan, thanks for your quick answer. I tried with skipAnalysis: false and now it works. I also received more events at our webhook.
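
The change described here can be sketched as the following fragment of the Canary spec. It is only a sketch assembled from the YAML posted in this issue, on the assumption that every other field stays as posted; it shows the workaround reported in this thread and does not confirm how Flagger is intended to behave when analysis is skipped.

```yaml
spec:
  # Assumption: all other fields stay exactly as in the Canary posted above.
  # Previously true; with analysis enabled, the post-rollout webhook was called.
  skipAnalysis: false
  analysis:
    webhooks:
    - name: Post Rollout Events
      timeout: 5s
      type: post-rollout
      url: https://webhook.site/XXXXXX
```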

By the way, regarding the skipAnalysis in the FAQ section:

Why is there a window of downtime during the canary initializing process when analysis is disabled?

A window of downtime is the intended behavior when the analysis is disabled. 
This allows instant rollback and also mimics the way a Kubernetes deployment initialization works. 
To avoid this, enable the analysis (skipAnalysis: true), wait for the initialization to finish, and disable it afterward (skipAnalysis: false).

https://docs.flagger.app/faq#why-is-there-a-window-of-downtime-during-the-canary-initializing-process-when-analysis-is-disabled

Shouldn't it be "... enable the analysis (skipAnalysis: false)", and vice versa? 🤔

gedankennebel (May 13 '22 08:05)

@aryan9600 Is this issue still relevant?

AhmedGrati (Mar 15 '23 10:03)

@stefanprodan does this mean the post-rollout hook is not triggered at all if skipAnalysis is true? Shouldn't the post-rollout hook trigger once the canary is promoted, regardless of whether analysis was done?

soumeng09 (Feb 19 '24 17:02)