Canary deployment with multiple ports using Gateway API routes all traffic to first port

Open dkulchinsky opened this issue 1 year ago • 3 comments

Describe the bug

When setting up a Canary using Gateway API for a Deployment that has multiple ports, all traffic is routed to the first port.

To Reproduce

Deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  labels:
    app.kubernetes.io/component: canary-demo
    app.kubernetes.io/instance: canary-demo
    app.kubernetes.io/name: canary-demo
  name: canary-demo
spec:
  progressDeadlineSeconds: 600
  selector:
    matchLabels:
      app.kubernetes.io/component: canary-demo
      app.kubernetes.io/instance: canary-demo
      app.kubernetes.io/name: canary-demo
  template:
    metadata:
      labels:
        app.kubernetes.io/component: canary-demo
        app.kubernetes.io/instance: canary-demo
        app.kubernetes.io/name: canary-demo
    spec:
      containers:
      - args:
        - --port=9898
        command:
        - ./podinfo
        env:
        - name: PODINFO_UI_MESSAGE
          value: canary-demo web1
        image: ghcr.io/stefanprodan/podinfo:6.7.1
        name: web1
        ports:
        - containerPort: 9898
          protocol: TCP
      - args:
        - --port=9899
        command:
        - ./podinfo
        env:
        - name: PODINFO_UI_MESSAGE
          value: canary-demo web2
        image: ghcr.io/stefanprodan/podinfo:6.5.4
        name: web2
        ports:
        - containerPort: 9899
          protocol: TCP

Canary manifest:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  labels:
    app.kubernetes.io/component: canary-demo
    app.kubernetes.io/instance: canary-demo
    app.kubernetes.io/name: canary-demo
  name: canary-demo
spec:
  analysis:
    interval: 60s
    maxWeight: 70
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 60
    stepWeight: 10
    threshold: 5
    webhooks:
    - metadata:
        cmd: curl -s http://canary-demo-svc.cd-demo:9898/
        type: bash
      name: acceptance-test
      timeout: 30s
      type: pre-rollout
      url: http://flagger-loadtester.flagger-testing/
    - metadata:
        cmd: hey -z 2m -q 15 -c 5 http://canary-demo-svc.cd-demo:9898/index.html
      name: load-test
      timeout: 30s
      type: rollout
      url: http://flagger-loadtester.flagger-testing/
  progressDeadlineSeconds: 60
  provider: gatewayapi:v1beta1
  service:
    gatewayRefs:
    - group: core
      kind: Service
      name: canary-demo-svc
      namespace: cd-demo
    name: canary-demo-svc
    port: 9898
    portDiscovery: true
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: canary-demo

Once the above is deployed, the following HTTPRoute is created:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: canary-demo-svc
  ownerReferences:
  - apiVersion: flagger.app/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Canary
    name: canary-demo
    uid: cabc846a-1a90-4b0c-b7c2-56f3753afa28
spec:
  parentRefs:
  - group: core
    kind: Service
    name: canary-demo-svc
    namespace: cd-demo
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: canary-demo-svc-primary
      port: 9898
      weight: 100
    - group: ""
      kind: Service
      name: canary-demo-svc-canary
      port: 9898
      weight: 0
    matches:
    - path:
        type: PathPrefix
        value: /

The HTTPRoute seems to take only port 9898 into account.

Once a canary is progressing or has succeeded, requests to the canary-demo-svc Service on port 9898 or 9899 both end up reaching the web1 container, which listens on port 9898.
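One way to check which container answers on each port is sketched below. It assumes the script runs from a pod inside the cluster that can resolve canary-demo-svc.cd-demo, and that podinfo's root endpoint returns JSON containing `message` and `version` fields for non-browser clients; the helper names are hypothetical.

```python
import json
import urllib.request


def describe_backend(body: str) -> str:
    """Summarize a podinfo JSON response as 'message (podinfo version)'."""
    info = json.loads(body)
    return f"{info.get('message', '?')} (podinfo {info.get('version', '?')})"


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Request the root path on host:port and describe whichever backend replied."""
    with urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
        return describe_backend(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Service name and ports are taken from the manifests above.
    for port in (9898, 9899):
        try:
            print(port, "->", probe("canary-demo-svc.cd-demo", port))
        except (OSError, ValueError) as exc:
            print(port, "-> request failed:", exc)
```

If routing were per port, port 9899 should report "canary-demo web2" with version 6.5.4; with the behavior described here, both ports would report web1.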

Expected behavior

Requests are routed to correct port.

The exact same setup using the linkerd provider instead of gatewayapi:v1beta1 works as expected. In both cases I'm enabling portDiscovery, but it seems to work as expected only with the linkerd provider.

Additional context

  • Flagger version: 1.39.0
  • Kubernetes version: 1.30.5
  • Service Mesh provider: Linkerd (although using gateway api in this case)
  • Ingress provider: NGINX Ingress Controller

dkulchinsky avatar Dec 02 '24 19:12 dkulchinsky

the portDiscovery field is used to generate k8s Services, not networking resources. a Canary object is meant for splitting traffic between two distinct Services listening on the same port. what should the HTTPRoute object look like according to you for the above scenario?

aryan9600 avatar Jan 13 '25 11:01 aryan9600

Hey @aryan9600 👋🏼 thanks for the response.

> the portDiscovery field is used to generate k8s Services, not networking resources.

It wasn't clear from the docs whether portDiscovery is supported when using gatewayapi; I guess it isn't.

> a Canary object is meant for splitting traffic between two distinct Services listening on the same port. what should the HTTPRoute object look like according to you for the above scenario?

I'm not sure, but it sounds like a Deployment with multiple ports is not supported when using the gatewayapi provider?

dkulchinsky avatar Jan 14 '25 02:01 dkulchinsky

@aryan9600

> what should the HTTPRoute object look like according to you for the above scenario?

~I believe an HTTPRoute can have multiple backendRefs, one for each discovered port?~

EDIT: reading it back, this doesn't make sense of course, since multiple backendRefs are already being used to assign different weights to the primary and canary.

What if multiple HTTPRoutes are generated, each matching the same parent service but using different ports?

Looking at the HTTPRoute spec, ParentRefs can specify the port, so building on my example above, we can define two HTTPRoutes:

  1. canary-demo-svc-9898:
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: canary-demo-svc-9898
  ownerReferences:
  - apiVersion: flagger.app/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Canary
    name: canary-demo
    uid: cabc846a-1a90-4b0c-b7c2-56f3753afa28
spec:
  parentRefs:
  - group: core
    kind: Service
    name: canary-demo-svc
    namespace: cd-demo
    port: 9898
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: canary-demo-svc-primary
      port: 9898
      weight: 100
    - group: ""
      kind: Service
      name: canary-demo-svc-canary
      port: 9898
      weight: 0
    matches:
    - path:
        type: PathPrefix
        value: /
  2. canary-demo-svc-9899:
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: canary-demo-svc-9899
  ownerReferences:
  - apiVersion: flagger.app/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Canary
    name: canary-demo
    uid: cabc846a-1a90-4b0c-b7c2-56f3753afa28
spec:
  parentRefs:
  - group: core
    kind: Service
    name: canary-demo-svc
    namespace: cd-demo
    port: 9899
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: canary-demo-svc-primary
      port: 9899
      weight: 100
    - group: ""
      kind: Service
      name: canary-demo-svc-canary
      port: 9899
      weight: 0
    matches:
    - path:
        type: PathPrefix
        value: /
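The two routes above differ only in the port, so the idea can be sketched as a small generator that emits one HTTPRoute per discovered port. This is only an illustration of the proposal, not Flagger's implementation: the function name is hypothetical, the manifests are plain dicts, and ownerReferences are omitted for brevity.

```python
from typing import Any


def build_port_routes(
    service: str, namespace: str, ports: list[int], primary_weight: int = 100
) -> list[dict[str, Any]]:
    """Return one HTTPRoute manifest (as a dict) per port."""
    routes = []
    for port in ports:
        routes.append({
            "apiVersion": "gateway.networking.k8s.io/v1beta1",
            "kind": "HTTPRoute",
            "metadata": {"name": f"{service}-{port}"},
            "spec": {
                # The parentRef pins the route to a single port of the parent Service.
                "parentRefs": [{
                    "group": "core",
                    "kind": "Service",
                    "name": service,
                    "namespace": namespace,
                    "port": port,
                }],
                "rules": [{
                    # Primary and canary backends share the same port and split the weight.
                    "backendRefs": [
                        {"group": "", "kind": "Service", "name": f"{service}-primary",
                         "port": port, "weight": primary_weight},
                        {"group": "", "kind": "Service", "name": f"{service}-canary",
                         "port": port, "weight": 100 - primary_weight},
                    ],
                    "matches": [{"path": {"type": "PathPrefix", "value": "/"}}],
                }],
            },
        })
    return routes


routes = build_port_routes("canary-demo-svc", "cd-demo", [9898, 9899])
for route in routes:
    print(route["metadata"]["name"], route["spec"]["parentRefs"][0]["port"])
```

During analysis, the same weight would need to be applied to every generated route so that all ports shift traffic together.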

wdyt @aryan9600 ?

dkulchinsky avatar Mar 08 '25 19:03 dkulchinsky