flagger
post-rollout webhook not executed
Describe the bug
We have enabled both the event webhook and the post-rollout webhook, and canary analysis is currently skipped. The post-rollout webhook is never called. The event webhook delivers the message "Promotion completed! Canary analysis was skipped for my-test-app", but no post-rollout webhook call follows.
To Reproduce
Our Canary YAML:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  annotations:
    meta.helm.sh/release-name: template-service
    meta.helm.sh/release-namespace: acp-tui
  creationTimestamp: "2022-05-11T14:48:50Z"
  generation: 1
  labels:
    app.kubernetes.io/chart: microservice
    app.kubernetes.io/instance: template-service
    app.kubernetes.io/managed-by: Helm
    app_name: template-service
    helm.sh/chart: microservice-0.19.0
    helm.toolkit.fluxcd.io/name: template-service
    helm.toolkit.fluxcd.io/namespace: acp-tui
  name: template-service
  namespace: acp-tui
  resourceVersion: "217131234"
  uid: 15b4fd7e-4491-40b0-88b5-1ba4b88b31c2
spec:
  analysis:
    interval: 30s
    iterations: 1
    metrics:
    - interval: 1m
      name: request-success-rate
      threshold: 99
    - interval: 1m
      name: request-duration
      threshold: 500
    threshold: 10
    webhooks:
    - name: Post Rollout Events
      timeout: 5s
      type: post-rollout
      url: https://webhook.site/XXXXXX
    - name: Flagger events
      type: event
      url: https://webhook.site/XXXXXX
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: template-service
  service:
    port: 8080
  skipAnalysis: true
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: template-service
status:
  canaryWeight: 0
  conditions:
  - lastTransitionTime: "2022-05-11T15:16:33Z"
    lastUpdateTime: "2022-05-11T15:16:33Z"
    message: Canary analysis completed successfully, promotion finished.
    reason: Succeeded
    status: "True"
    type: Promoted
  failedChecks: 0
  iterations: 0
  lastAppliedSpec: 864769877f
  lastTransitionTime: "2022-05-11T15:16:33Z"
  phase: Succeeded
  trackedConfigs:
    secret/config-server-secret: 110e449ecf8c3025
    secret/tui-logging-index: 9ce158b29ac19da7
This is the output of a describe on the Canary resource:
│ Status: │
│ Canary Weight: 0 │
│ Conditions: │
│ Last Transition Time: 2022-05-11T15:16:33Z │
│ Last Update Time: 2022-05-11T15:16:33Z │
│ Message: Canary analysis completed successfully, promotion finished. │
│ Reason: Succeeded │
│ Status: True │
│ Type: Promoted │
│ Failed Checks: 0 │
│ Iterations: 0 │
│ Last Applied Spec: 864769877f │
│ Last Transition Time: 2022-05-11T15:16:33Z │
│ Phase: Succeeded │
│ Tracked Configs: │
│ secret/config-server-secret: 110e449ecf8c3025 │
│ secret/tui-logging-index: 9ce158b29ac19da7 │
│ Events: │
│ Type Reason Age From Message │
│ ---- ------ ---- ---- ------- │
│ Normal Synced 47m flagger Initialization done! template-service.acp-tui │
│ Normal Synced 20m flagger New revision detected! Scaling up template-service.acp-tui │
│ Normal Synced 19m flagger Copying template-service.acp-tui template spec to template-service-primary.acp-tui │
│ Normal Synced 19m flagger Promotion completed! Canary analysis was skipped for template-service.acp-tui
Expected behavior
According to the webhooks documentation (https://docs.flagger.app/usage/webhooks):
post-rollout hooks are executed after the canary has been promoted or rolled back. If a post rollout hook fails the error is logged.
We expect the post-rollout webhook to be called after the canary has been promoted.
Additional context
- Flagger version: 1.19.0
- Kubernetes version: v1.22.6-eks-7d68063
- Service Mesh provider: linkerd
- Ingress provider: AWS Application Load Balancer (ALB)
Does it work with skipAnalysis: false?
Hi Stefan, thanks for your quick answer. I tried with skipAnalysis: false and now it works. I also received more events at our webhook.
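For reference, a minimal sketch of the fragment that changed, assuming the rest of the Canary posted above stays the same. Only the `skipAnalysis` value differs from the original manifest; the webhook entry is copied from it unchanged.

```yaml
# Fragment of the Canary spec only; all other fields as in the manifest above.
spec:
  skipAnalysis: false        # was "true" in the original report
  analysis:
    webhooks:
    - name: Post Rollout Events
      timeout: 5s
      type: post-rollout     # this hook was called once analysis was no longer skipped
      url: https://webhook.site/XXXXXX
```

Note that with analysis enabled, the metric checks defined under `analysis.metrics` run as well, so this is a workaround rather than an equivalent configuration.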
By the way, regarding the skipAnalysis in the FAQ section:
Why is there a window of downtime during the canary initializing process when analysis is disabled?
A window of downtime is the intended behavior when the analysis is disabled.
This allows instant rollback and also mimics the way a Kubernetes deployment initialization works.
To avoid this, enable the analysis (skipAnalysis: true), wait for the initialization to finish, and disable it afterward (skipAnalysis: false).
https://docs.flagger.app/faq#why-is-there-a-window-of-downtime-during-the-canary-initializing-process-when-analysis-is-disabled
Should it not be "... enable the analysis (skipAnalysis: false)", and vice versa? 🤔
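To make the question concrete, here is how the FAQ sequence would read if "enable the analysis" maps to `skipAnalysis: false`. This is the reading proposed in this comment, not a confirmed correction of the docs.

```yaml
# Step 1: during canary initialization, keep the analysis enabled.
spec:
  skipAnalysis: false
---
# Step 2: once initialization has finished, disable the analysis.
spec:
  skipAnalysis: true
```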
@aryan9600 Is this issue still relevant?
@stefanprodan Does this mean the post-rollout hook is not triggered at all if skipAnalysis is true? Shouldn't the post-rollout hook be triggered once the canary is promoted, regardless of whether analysis was done?