
A VirtualService that worked in version 2.21.0 is causing an error in the latest version.

Open finda-yeongjo opened this issue 2 years ago • 0 comments

Checklist:

  • [x] I've included steps to reproduce the bug.
  • [x] I've included the version of argo rollouts.

Describe the bug

I originally ran all pods in the default namespace. To meet security requirements, I moved everything in the Stage environment to a different namespace (for example, test).

I have route rules in public-vs for services that are accessed directly from the internet, and in private-vs for services that are only accessed internally.

My Rollout resources set spec.strategy.canary.trafficRouting.istio.virtualService.name. As described above, each Rollout points at one of these VirtualServices: if that VirtualService has no route for the Rollout, HEALTH DETAILS fails, and if the route exists, it succeeds.
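For orientation, a Rollout stanza of the kind described here might look roughly like the fragment below. This is a sketch under assumptions, not the actual manifest: the VirtualService name, http route name, namespace, and subset names are taken from the config and error message in this issue, while the DestinationRule name is a hypothetical placeholder and the `routes` entry is shown only to illustrate where an http route can be named. Whether naming the route changes how the tcp route is validated has not been verified.

```yaml
# Fragment only: selector, template, steps, and other required fields are omitted.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: yeongjo-test-server-rollout
  namespace: backend
spec:
  strategy:
    canary:
      trafficRouting:
        istio:
          virtualService:
            name: yeongjo-private-vs      # the VirtualService named in the error
            routes:
              - yeongjo-test-route        # http route name from private-vs
          destinationRule:
            name: example-destrule        # hypothetical name, not from the issue
            canarySubsetName: canary      # subset names as they appear in the http route
            stableSubsetName: stable
```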

My private-vs contains both an http route and a tcp route:

http:
    - name: yeongjo-test-route
      route:
        - destination:
            host: yeongjo-test-server-svc.backend.svc.cluster.local
            port:
              number: 80
            subset: stable
          weight: 100
        - destination:
            host: yeongjo-test-server-svc.backend.svc.cluster.local
            port:
              number: 80
            subset: canary
          weight: 0
tcp:
    - match:
        - port: SOMEPORT
      route:
        - destination:
            host: yeongjo-test-server-thread-svc.backend.svc.cluster.local
            port:
              number: 80
          weight: 100

I keep exactly the same settings in Stage and Production, and I'm currently working on Stage first. As soon as I change the namespace, I get the error shown in the logs below. As an immediate workaround, removing the tcp rule from private-vs makes it work. However, because of the allowed hosts, I can't split the tcp rule out into a separate VirtualService.

The Production environment works well with the same VirtualService config, and the route rules in the Stage VirtualService have not changed. The only difference between the two environments is the argo-rollouts version. Was there a change in this area? This is a critical issue for our production environment.

What's the problem?

To Reproduce

In the default namespace, configure the Istio gateway (with the Istio ingressgateway located in istio-system), the VirtualService, and the service pods.

Then move all resources from the default namespace to another namespace, such as SOMETHING, and apply them.

Expected behavior

  • Everything works without errors.

Screenshots

Version

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argo-rollouts
  namespace: argocd
....
  source:
    repoURL: https://argoproj.github.io/argo-helm
    targetRevision: 2.21.0 # Occurs when upgrading to 2.34.3 or later versions
    chart: argo-rollouts
    helm: ......

Logs

- InvalidSpec: The Rollout "yeongjo-test-server-rollout" is invalid: spec.strategy.canary.trafficRouting.istio.virtualService.name: Invalid value: "yeongjo-private-vs": Istio VirtualService has invalid TCP routes. Error: Canary DestinationRule subset 'canary' not found in route

Logs for the entire controller:

kubectl logs -n argo-rollouts deployment/argo-rollouts

time="2024-03-20T10:02:35Z" level=info msg="Started syncing rollout" generation=129 namespace=backend resourceVersion=244653624 rollout=devops-tester-rollout
time="2024-03-20T10:02:35Z" level=error msg="The Rollout \"devops-tester-rollout\" is invalid: spec.strategy.canary.trafficRouting.istio.virtualService.name: Invalid value: \"yeongjo-private-vs\": Istio VirtualService has invalid TCP routes. Error: Canary DestinationRule subset 'canary' not found in route" namespace=backend rollout=devops-tester-rollout
time="2024-03-20T10:02:35Z" level=info msg="Reconciliation completed" generation=129 namespace=backend resourceVersion=244653624 rollout=devops-tester-rollout time_ms=3.329985

Logs for a specific rollout:

kubectl logs -n argo-rollouts deployment/argo-rollouts | grep rollout=<ROLLOUTNAME>

It's the same as above



finda-yeongjo, Mar 20 '24 10:03