
Support multiple istio control planes

Open f4tq opened this issue 5 years ago • 10 comments

Is your feature request related to a problem? Please describe.

Istio supports multiple control planes, as described here. It seems to me that the manual update of Helm templates described therein would be way easier using the Banzai operator, were it not to assume it was a singleton.

Describe the solution you'd like to see

Remove the constraint on a singleton Istio control plane instance in the validation controller. There are several pilot command-line changes that need to occur, as mentioned above, i.e. limiting namespace overlap so two pilots don't collide on a watched namespace and the Istio deployment namespace. This seems manageable from within the Banzai controller, as it would have the ability to compare all the Istio CRD (control plane) instances and enforce namespace boundaries (ns-foo) with additional pilot options:

--snip--
    - name: discovery
      image: docker.io/<user ID>/pilot:<tag>
      imagePullPolicy: IfNotPresent
      args: ["discovery", "-v", "2", "--admission-service", "istio-pilot", "--appNamespace", "ns-foo"]
--snip--

A single controller, such as istio-operator, could prevent pilots from watching overlapping namespaces, and of course, deploying itself into istio-system.

Describe alternatives you've considered

  • Not using banzai and following https://istio.io/blog/2018/soft-multitenancy/ which is undesirable
  • The fairly extreme alibaba solution

Additional context

The use of multiple Istio control planes could limit the proliferation of the super complicated alternatives recently brought up in the last Kubernetes multi-tenancy working group. The notion of super-masters that Alibaba proposes seems extreme, but may be necessary. However, it seems worth trying a lighter-weight Istio implementation with multiple control planes first, especially given the presence of Envoy for ingress and egress. Pod-to-pod communication can be naturally controlled by the tenant operator (Envoy) using separate ingress/egress. Of course, for true hard multitenancy, it may be necessary to convolute everything, but I'd like to try Istio first.

Thoughts?

f4tq avatar May 13 '19 17:05 f4tq

@f4tq thank you for the detailed feature request. I think it absolutely makes sense, and it came up earlier in our discussions as well. But we haven’t had the resources to start planning the feature and working on a proper solution yet. That’s why we’ve settled on the singleton solution for now. We'll take a deeper look at your proposal and get back to you soon. Would you be interested in helping us implement it?

martonsereg avatar May 13 '19 18:05 martonsereg

Well, it looks like banzaicloud/istio-operator will get rolled into istio/operator.

Right?

f4tq avatar May 15 '19 02:05 f4tq

In a way, yes - we are part of that team and about to push/add our work there, with some observations:

  • The official Istio operator’s scope is currently narrower than the Banzai Cloud one (for now it stops around installs and upgrades)
  • We (Banzai Cloud) have Pipeline, a multi- and hybrid-cloud project similar to Google Anthos, built on our version of the operator. We therefore need to operate multiple clusters, and we need additional features today that are not (or not yet) in the scope of the official operator
  • Once the official operator reaches the maturity and minimal scope for installs and updates, we will remove that part from our operator and only add features on top (those we need to operate hybrid or multi-clusters, CNI, service affinities, etc.) - we will probably rename the repo as well.
  • Our common goal (currently Google, Red Hat and Banzai Cloud are working on the official operator) is to make the official operator available to the community with support for installs/upgrades and see how it goes from there. Ideally, we (Banzai Cloud) would like to have only one operator - the official one (with a feature set/roadmap similar to ours, as that would fulfill all our needs for the hybrid cloud products we are building on top of this with Pipeline)

matyix avatar May 15 '19 09:05 matyix

Agreed: the official one would be great - at least from a Go structure/CRD perspective. I'm a little worried about a monolithic CRD, as we have to deal with multi-tenant situations which also require Istio gateway additions/subtractions for groups sharing a control plane (not supported in Banzai).

istio/operator PR 2 focuses on the initial Go structures. They look very similar to Banzai's. I just commented on that.

Would you move to use those Go structures once PR 2 gets merged? Or at least when the follow-up CRD PR mapping to them gets merged?

I, too, need something to work sooner rather than later. I think I could make multiple control planes happen against your code base, but it may be throwaway work.

Pondering...

f4tq avatar May 15 '19 19:05 f4tq

@f4tq even without multi-tenancy it seems like a good idea to define more CRDs for the different components or features; the Gateway that you mentioned is a particularly good example of that.

We already discussed that a bit internally and I'm actually in favour of several smaller CRDs, but as multi-tenancy support was not on the short-term roadmap, we are still pondering this problem and have not cracked it just yet.

Moving from a monolithic CRD to separate CRDs wouldn’t be an easy transition in the future, so we agree that it would make sense to consider it in the planning, or in the first release of the official operator.

I just created a separate issue (#206) for discussing it further as it is not strictly necessary for the multi-tenancy support.

waynz0r avatar May 15 '19 20:05 waynz0r

What would be needed to push this forward?

kubaj avatar Sep 25 '19 15:09 kubaj

Hi @kubaj,

Our attempts at running multiple control planes were not very successful, and we ran into lots of issues. While we believe that we understand Istio quite well, and run pretty large (in size and number) clusters for our customers, providing proper multi-tenancy support with all the configuration options (and their dependencies), along with the complexity of making sure the tenants don't step on each other's toes (from a config and maintenance perspective), is quite a hassle.

We are working on validation right now, and validating these kinds of configs brings complexities which are not documented, so we had to dig through and check the code (now imagine this for CRs in a multi-tenancy setup). Long story short, we decided to push this back, learn from and listen to those who are running such setups, and move forward as Istio multi-tenancy support progresses and becomes actually stable and battle-tested.

Automatically creating and configuring K8s clusters is fairly cheap (at least for us as we have automated mesh setups with Pipeline) and at this point we stick with one mesh per K8s cluster(s).

I would be grateful to learn about the why's and whether you run such a use case in production now.

Including/excluding different Istio components from a mesh is a different story (I need this Pilot version, don't need Mixer, certs for mTLS are generated in a different way, etc.) - it is working for some of the components right now, and the rest is on the roadmap.

waynz0r avatar Sep 26 '19 18:09 waynz0r

Thank you, good to know. The problem with our setup is that we have a big cluster, and it would be difficult to split it now. We will think about splitting the cluster in the future.

kubaj avatar Oct 20 '19 17:10 kubaj

Tried to build multiple control planes using the Banzai Cloud operator, but it didn't work. I guess this is still pending, with no roadmap for when it will be done.

    $ kubectl create -n istio-system-mobility-oam -f config/samples/istio_v1beta1_istio.yaml
    Error from server (istio config resource already exists): error when creating "config/samples/istio_v1beta1_istio.yaml": admission webhook "istio.validation.banzaicloud.io" denied the request: istio config resource already exists

sb1975 avatar Jun 03 '20 04:06 sb1975

Hi @sb1975,

You are right, it is not possible to create multiple control planes just yet, but it is on our roadmap to make this work properly.

May I ask what use case you are trying to solve with this approach?

Laci21 avatar Jun 08 '20 15:06 Laci21