flagger drops virtualhost object from contour's HTTPProxy definition with A/B canary rollout
Describe the bug
We are trying to use flagger for progressive delivery, particularly in an A/B testing scenario.
The ingress provider we use is Contour, while the mesh provider is linkerd.
Testing weighted canary deployments with linkerd was successful, but A/B testing with Contour failed.
When the canary iteration starts, flagger modifies the HTTPProxy resource and drops the virtualhost object from it,
which causes the HTTPProxy to become orphaned.
Just a note: we are not using nested HTTPProxy definitions, so each deployment defines its own root HTTPProxy.
Another thing:
flagger doesn't roll back the HTTPProxy CRD to its original state after an unsuccessful rollout, so the application stops serving requests.
To Reproduce
Canary definition
---
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: devops-test-app
  namespace: devops
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: devops-test-app
  autoscalerRef:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: devops-test-app
  progressDeadlineSeconds: 60
  service:
    port: 5000
    targetPort: 5000
  provider: contour
  analysis:
    interval: 1m
    threshold: 5
    iterations: 10
    match:
      - headers:
          contoso-test:
            exact: "integration"
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 2m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 2m
Original HTTPProxy CRD
---
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: devops-test-app
  namespace: devops
  labels:
    app.kubernetes.io/name: devops-test-app
    app.kubernetes.io/version: "0.1.0"
spec:
  virtualhost:
    fqdn: devops-test-app.contoso.com
    tls:
      secretName: star-cert
  routes:
    - services:
        - name: devops-test-app
          port: 5000
          requestHeadersPolicy:
            set:
              - name: l5d-dst-override
                value: "%CONTOUR_SERVICE_NAME%.%CONTOUR_NAMESPACE%.svc.cluster.local:%CONTOUR_SERVICE_PORT%"
          responseHeaderPolicy:
            remove:
              - l5d-client-id
      timeoutPolicy:
        response: 30s
HTTPProxy CRD after flagger starts canary advancement
---
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: devops-test-app
  namespace: devops
spec:
  routes:
    - conditions:
        - header:
            exact: integration
            name: contoso-test
          prefix: /
      services:
        - name: devops-test-app-primary
          port: 5000
          requestHeadersPolicy:
            set:
              - name: l5d-dst-override
                value: devops-test-app-primary.devops.svc.cluster.local:5000
        - name: devops-test-app-canary
          port: 5000
          requestHeadersPolicy:
            set:
              - name: l5d-dst-override
                value: devops-test-app-canary.devops.svc.cluster.local:5000
          weight: 100
    - conditions:
        - prefix: /
      services:
        - name: devops-test-app-primary
          port: 5000
          requestHeadersPolicy:
            set:
              - name: l5d-dst-override
                value: devops-test-app-primary.devops.svc.cluster.local:5000
          weight: 100
        - name: devops-test-app-canary
          port: 5000
          requestHeadersPolicy:
            set:
              - name: l5d-dst-override
                value: devops-test-app-canary.devops.svc.cluster.local:5000
status:
  conditions:
    - errors:
        - message: this HTTPProxy is not part of a delegation chain from a root HTTPProxy
          reason: Orphaned
          status: "True"
          type: Orphaned
      lastTransitionTime: ""
      message: At least one error present, see Errors for details
      observedGeneration: 3
      reason: ErrorPresent
      status: "False"
      type: Valid
  currentStatus: orphaned
  description: this HTTPProxy is not part of a delegation chain from a root HTTPProxy
  loadBalancer:
    ingress:
      - ip: 1.1.1.1
Expected behavior
- flagger keeps the virtualhost object definition in the HTTPProxy during canary advancement in the A/B testing scenario.
- flagger rolls back the HTTPProxy CRD to its original state if the rollout was unsuccessful.
Additional context
- Flagger version: 1.31.0
- Kubernetes version: 1.26.3
- Service Mesh provider: linkerd stable-2.13.5
- Ingress provider: contour 1.24.2
The Contour integration expects to create the HTTPProxy object itself, using the values provided in the Canary definition, so it ends up overwriting any existing HTTPProxy objects.
Users are expected to create another HTTPProxy with the desired .spec.virtualhost, which includes the one generated by Flagger:
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: devops-test-app-ingress
  namespace: devops
spec:
  virtualhost:
    fqdn: devops-test-app.contoso.com
    tls:
      secretName: star-cert
  includes:
    - name: podinfo
      namespace: test
      conditions:
        - prefix: /
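Note that podinfo and test in the includes entry above are the names used in the tutorial. For the Canary in this issue, the include would presumably need to point at the HTTPProxy that Flagger manages, which in the report above is devops-test-app in the devops namespace. A minimal sketch under that assumption (the root proxy name devops-test-app-ingress is only illustrative, and the hand-written root HTTPProxy must not share a name with the one Flagger generates):

```yaml
# Root HTTPProxy owned by the user; Flagger does not manage this object.
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: devops-test-app-ingress   # illustrative name, must differ from the Canary name
  namespace: devops
spec:
  virtualhost:
    fqdn: devops-test-app.contoso.com
    tls:
      secretName: star-cert
  includes:
    # Assumption: Flagger's generated HTTPProxy carries the Canary's
    # name and namespace, as seen in the modified object in this issue.
    - name: devops-test-app
      namespace: devops
      conditions:
        - prefix: /
```

With this layout the virtualhost lives only on the root object, so the routes Flagger rewrites during analysis stay part of a delegation chain instead of becoming orphaned.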
I recommend you go through the tutorial once: https://fluxcd.io/flagger/tutorials/contour-progressive-delivery/
Thanks for your reply. Why would flagger then touch existing HTTPProxy objects bound to a certain deployment?