istio-ca-root-cert CM not created in revisioned CPs with different DiscoverySelectors

Open RicHincapie opened this issue 6 months ago • 1 comments

Is this the right place to submit this?

  • [x] This is not a security vulnerability or a crashing bug
  • [x] This is not a question about how to use Istio

Bug Description

Pods in new namespaces stay in the Init state because the istio-ca-root-cert ConfigMap is missing:

  —————»  ns:istio-system ❯ kgp -A | grep httpbin
httpbin122-v2        netshoot                                                    0/2     Init:0/1   0          28m
httpbin122           httpbin-686d6fc899-nhfgj                                    0/2     Init:0/1   0          31m
httpbin123           httpbin-686d6fc899-fzrcr                                    2/2     Running    0          31m
  —————»  ns:istio-system ❯ kg cm -A | grep httpbin
httpbin122-v2        kube-root-ca.crt                                       1      30m
httpbin122           kube-root-ca.crt                                       1      32m
httpbin123           istio-ca-root-cert                                     1      30m
httpbin123           kube-root-ca.crt                                       1      32m
  —————»  ns:istio-system ❯ kgp
NAME                           READY   STATUS    RESTARTS   AGE
istiod-1-22-5897bcfb64-bkkpw   1/1     Running   0          61m
istiod-1-23-6bb954684c-qfq5v   1/1     Running   0          64m
  —————»  ns:istio-system ❯ kg ns --show-labels | grep httpbin
httpbin122           Active   51m   istio-discovery=istio-1-22,istio.io/rev=1-22,kubernetes.io/metadata.name=httpbin122
httpbin122-v2        Active   49m   istio-discovery=istio-1-22,istio.io/rev=1-22,kubernetes.io/metadata.name=httpbin122-v2
httpbin123           Active   51m   istio-discovery=istio-1-23,istio.io/rev=1-23,kubernetes.io/metadata.name=httpbin123

Note that the non-working ones are attached to the first revision installed.
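
For context, the namespace labels above suggest a setup roughly like the one sketched below. This is an assumption, not the reporter's actual configuration: the issue does not include the install manifests, so the `IstioOperator` layout, the `write_operator` helper, the file names, and the selector values are all inferred from the `istio-discovery` labels shown in the output.

```shell
#!/bin/sh
# Hypothetical helper (not from the issue): writes one IstioOperator file
# per revision, each with its own discoverySelectors entry.
write_operator() {
  rev="$1"   # revision name, e.g. 1-22
  out="$2"   # output file, e.g. istio-1-22.yaml
  cat > "$out" <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  revision: ${rev}
  meshConfig:
    discoverySelectors:
      - matchLabels:
          istio-discovery: istio-${rev}
EOF
}

# Assumed values, matching the labels seen on the httpbin namespaces.
write_operator 1-22 istio-1-22.yaml
write_operator 1-23 istio-1-23.yaml
```

Each generated file could then be applied with something like `istioctl install -f istio-1-22.yaml`, so that each control plane only discovers namespaces carrying its own `istio-discovery` label.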

Once the ns label istio-discovery is changed from istio-1-22 to istio-1-23, the istio-ca-root-cert ConfigMap is created and the pods come up.

  —————»  ns:istio-system ❯ kg ns --show-labels | grep httpbin
httpbin122           Active   64m   istio-discovery=istio-1-23,istio.io/rev=1-23,kubernetes.io/metadata.name=httpbin122
httpbin122-v2        Active   62m   istio-discovery=istio-1-22,istio.io/rev=1-22,kubernetes.io/metadata.name=httpbin122-v2
httpbin123           Active   64m   istio-discovery=istio-1-23,istio.io/rev=1-23,kubernetes.io/metadata.name=httpbin123
  —————»  ns:istio-system ❯ kg cm -A | grep httpbin
httpbin122-v2        kube-root-ca.crt                                       1      49m
httpbin122           istio-ca-root-cert                                     1      22s
httpbin122           kube-root-ca.crt                                       1      51m
httpbin123           istio-ca-root-cert                                     1      49m
httpbin123           kube-root-ca.crt                                       1      51m

Also, if istiod-1-23 is scaled to 0 replicas, the istio-ca-root-cert cm for the ns httpbin122-v2 is created immediately:

  —————»  ns:istio-system ❯ k scale deploy istiod-1-23 --replicas 0
deployment.apps/istiod-1-23 scaled
  —————»  ns:istio-system ❯ kg cm -A | grep httpbin
httpbin122-v2        istio-ca-root-cert                                     1      4s
httpbin122-v2        kube-root-ca.crt                                       1      64m
httpbin122           istio-ca-root-cert                                     1      15m
httpbin122           kube-root-ca.crt                                       1      66m
httpbin123           istio-ca-root-cert                                     1      64m
httpbin123           kube-root-ca.crt                                       1      66m

If I then create a new ns for rev 1-23, the issue repeats:

  —————»  ns:istio-system ❯ kgp -A | grep httpbin
httpbin122-v2        netshoot                                                    2/2     Running    0          94m
httpbin122           httpbin-6b69b57f7c-rmhjf                                    2/2     Running    0          44m
httpbin123-v2        httpbin-686d6fc899-pjw2s                                    0/2     Init:0/1   0          9s
httpbin123           httpbin-686d6fc899-fzrcr                                    2/2     Running    0          96m
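
To spot affected namespaces without reading the listings by eye, a small filter along these lines may help. It is a minimal sketch, not part of the original report: `missing_root_cert` is a hypothetical helper name, and it assumes input in the `kubectl get cm -A --no-headers` column layout (namespace first, ConfigMap name second).

```shell
#!/bin/sh
# Hypothetical helper: reads "NAMESPACE NAME ..." lines on stdin and prints
# every namespace that has no istio-ca-root-cert ConfigMap.
# Caveat: a namespace with no ConfigMaps at all never appears in the input,
# so it cannot be reported.
missing_root_cert() {
  awk '
    { seen[$1] = 1 }                            # every namespace in the listing
    $2 == "istio-ca-root-cert" { has[$1] = 1 }  # namespaces that have the cert
    END { for (ns in seen) if (!(ns in has)) print ns }
  ' | LC_ALL=C sort
}

# Example against a live cluster (plain kubectl instead of the kg alias):
#   kubectl get cm -A --no-headers | grep httpbin | missing_root_cert
```

Fed the first ConfigMap listing from this report, the filter would print httpbin122 and httpbin122-v2, the two namespaces whose pods are stuck in Init.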

This is reproducible on versions up to v1.26.

Version

$ is version
client version: 1.26.0
istiod version: 1.22.0
istiod version: 1.23.0

Additional Information

No response

RicHincapie avatar Jun 12 '25 11:06 RicHincapie

Also reported here: https://github.com/istio/istio/issues/52988

morepork avatar Jun 18 '25 04:06 morepork

Can someone provide more information about the use-case for different discoverySelectors per revision?

keithmattix avatar Jul 02 '25 17:07 keithmattix

This is a QA cluster in which the platform team required a virtual division into parallel control planes (CPs) running different versions. I don't know the full motivation, but the reasons may include:

  • Lowering QA costs
  • Parallel testing
  • Multi tenancy

RicHincapie avatar Jul 03 '25 05:07 RicHincapie

@keithmattix @therealmitchconnors does this make sense to you?

zirain avatar Jul 07 '25 08:07 zirain

This seems like more complexity than Discovery Selectors was really intended to cover. Multitenancy is far more complex than simply allowing multiple meshes in one cluster, and is definitely not supported. As for QA costs and parallel testing, I'd recommend looking into cluster virtualization technology, like vCluster or kind.

therealmitchconnors avatar Jul 08 '25 00:07 therealmitchconnors

Ok. Is it worth documenting this limitation here? https://istio.io/latest/docs/reference/config/istio.mesh.v1alpha1/ I could do it.

RicHincapie avatar Jul 10 '25 03:07 RicHincapie

@therealmitchconnors @keithmattix We have experienced this as well. We have an EKS cluster with 4 revisions deployed. The ConfigMap is not created automatically for new namespaces, even though the namespace carries the discoverySelector label of one of the revisions. We need a multi-revision setup in the same cluster because of scale: our cluster can reach 80k pods. If we use a single Istio for the whole cluster, we run into many problems with scalability and endpoints.

So our use case is to run multiple revisions, with namespaces of high pod density managed by dedicated revisions.

I also added an issue https://github.com/istio/istio/issues/57651#issuecomment-3312169883

zakariais avatar Sep 22 '25 09:09 zakariais