[Platforms] Publish Kubernetes multitenancy whitepaper

Open AlexsJones opened this issue 3 years ago • 31 comments

There is an ever-growing demand for guidance and resources to classify how to support multi-tenant Kubernetes clusters. In the spirit of the recently published Operator Whitepaper there is an opportunity to collaborate and aggregate industry examples to provide a rich resource for those looking to understand the history, context, and direction of multi-tenant Kubernetes architecture.

Examples:

  • Cloud providers running tenant clusters within a single parent cluster.
  • Development teams wishing to partition their resources and teams.
  • Security groups looking to ensure isolation between workload tenancy.

Currently, there are several approaches teams are following:

  • Running individual clusters per tenant
  • Running tenants in different namespaces
  • Running virtual cluster in cluster (like cluster)

Please comment below if you:

  • are delivering multi-tenant applications
  • provide infrastructure for multi-tenancy on Kubernetes
  • are involved in OSS projects related to the topic

AlexsJones avatar Feb 02 '22 10:02 AlexsJones

I think this would be a very interesting project. One aspect that's definitely worth considering is comparison between soft and hard multi-tenancy, as the security requirements of hard multi-tenancy can be tricky to achieve with stock Kubernetes.

raesene avatar Feb 02 '22 11:02 raesene

I think @sedefsavas and I could potentially be interested, as we built our own multitenancy implementation for Kubernetes Cluster API here and have submitted a KubeCon talk about it too.

randomvariable avatar Feb 02 '22 12:02 randomvariable

See also https://github.com/kubernetes/website/issues/31479

adrianludwin avatar Feb 02 '22 13:02 adrianludwin

Thanks, @adrianludwin, this feels like something we could apply a combined effort on. Whilst we certainly target Kubernetes as the substrate, the App Delivery TAG angle is going to be heavily predicated on how this pattern is used E2E and its enablement for the delivery of tenant workloads (so the scope might vary to some degree and be slightly broader).

That said, I would be keen to know what work has been started if any and how we can pool our resources 🚀

I do notice a few projects mentioned in the related issue and I would love to provide some additional examples to make sure we have a fair and representative view of a) the landscape of OSS and CNCF projects b) the commercial vendors and their approaches + challenges.

AlexsJones avatar Feb 02 '22 13:02 AlexsJones

I have discussed this recently with @AloisReitbauer and would be happy to give my/our five cents.

We work with various customers together to deliver multi-tenant clusters, and give guidance to the dev teams on how to tackle that.

mkorbi avatar Feb 02 '22 14:02 mkorbi

Running virtual cluster in cluster (like cluster)

@AlexsJones did you mean vcluster there in the parens? :)

Besides vcluster, our commercial product Loft is focused on multi-tenancy too.

richburroughs avatar Feb 02 '22 16:02 richburroughs

We may want to differentiate between provider-side multitenancy, like how Azure or AWS support multiple organizations; and user-side multitenancy, between groups in a single organization.

joshgav avatar Feb 02 '22 16:02 joshgav

After the App Delivery TAG call the consensus is to move this into the Cooperative delivery WG and take this forward. It will be put on the docket for the next meeting and we can carry this forward.

AlexsJones avatar Feb 02 '22 17:02 AlexsJones

I'm really excited about this, as it's a topic I would love to get into the details of. A lot of room for collaboration here. 👏

roberthstrand avatar Feb 02 '22 18:02 roberthstrand

Really interested in this! I can share our current implementation and how we are trying to improve it:

We (the "platform" team) provide several Kubernetes clusters for different projects inside our organization:

  • one K8S cluster per environment (dev, staging, production region 1, production region 2...)
  • in each K8S cluster, multiple tenants (projects) can deploy resources

So we do what you called "user-side" multitenancy.

When a project wants to be hosted on our platform, they have to open a pull request in our self-hosted gitlab 'onboarding' repository to:

  • add their desired namespaces into a YAML file like:
      project1:
      - project1_ns1
      - project1_ns2
      project2:
      - project2_ns1
      - project2_ns2
      ...
  • optionally add an argocd applicationset custom resource (usually with a 'gitlab generator' for argocd appset controller to autodiscover their git 'deployment' repositories and generate corresponding argocd applications)
  • optionally add additional RBAC for specific needs
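The namespace mapping in the first bullet above lends itself to an automated check before the PR is merged. The following is only a sketch of such a check, not the team's actual tooling: the function name `validate_onboarding` and the sample data are hypothetical. It flags namespaces requested by more than one project and names that Kubernetes would reject. Namespace names must be RFC 1123 DNS labels, so the underscores in the placeholder names above would have to become hyphens in a real file.

```python
import re

# Kubernetes namespace names must be RFC 1123 DNS labels: lowercase
# alphanumerics or '-', starting and ending with an alphanumeric
# character, and at most 63 characters long.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def validate_onboarding(projects):
    """Return a list of human-readable problems found in the mapping.

    `projects` maps a project name to the list of namespaces it requests,
    mirroring the structure of the onboarding YAML file (hypothetical helper).
    """
    problems = []
    owner_of = {}  # namespace -> first project that claimed it
    for project, namespaces in projects.items():
        for ns in namespaces:
            if len(ns) > 63 or not DNS_LABEL.fullmatch(ns):
                problems.append(f"{project}: '{ns}' is not a valid namespace name")
            if ns in owner_of and owner_of[ns] != project:
                problems.append(
                    f"{project}: '{ns}' is already claimed by {owner_of[ns]}"
                )
            owner_of.setdefault(ns, project)
    return problems


if __name__ == "__main__":
    # Sample data is illustrative only.
    requested = {
        "project1": ["project1-ns1", "project1-ns2"],
        "project2": ["project2-ns1", "project1-ns1", "project2_ns2"],
    }
    for problem in validate_onboarding(requested):
        print(problem)
```

Run against the sample data, the check reports the namespace claimed by both projects and the name containing an underscore. An empty list means the mapping is safe to hand to the pipeline.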

This PR is reviewed and, if merged, a Jenkins pipeline initializes the tenant, using an 'admin' kubeconfig to create all the resources. The pipeline:

  • creates the desired namespaces with a label 'owner=project1' and a label 'netpol-gen=true' (used by a kyverno clusterpolicy to generate network policies for these namespaces)
  • generates kubeconfigs for this project (one for the project's admin/sre/devops roles, which allows 'admin-like' permissions in the project's namespaces, and one for devs, who have fewer permissions)
  • generates RBAC for the project to limit their actions to their namespaces (we use the RBACManager project here)
  • creates a k8s docker-registry secret with harbor credentials to pull only the project's images from harbor (the imagePullSecret of the namespace is set to point to this specific secret)
  • adds any additional RBAC for specific needs
  • launches some tests to check the isolation and the kubeconfig scope
  • applies any argocd applicationset resources to automatically generate all of the project's argocd applications

This is currently done with a shell script... (using templated YAML for resources like namespaces, RBAC...)
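As an illustration of the first and third steps above, here is a minimal sketch that renders the per-tenant manifests programmatically instead of from templated YAML. It makes several assumptions that are not in the comment: a plain RoleBinding to the built-in `edit` ClusterRole stands in for the RBACManager resources, the group name `<project>-developers` is hypothetical, and the function names are invented for this example. Only the two namespace labels come from the description above.

```python
import json


def namespace_manifest(project, namespace):
    """Build a Namespace manifest carrying the two labels described above."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": namespace,
            "labels": {"owner": project, "netpol-gen": "true"},
        },
    }


def developer_rolebinding(project, namespace, group):
    """Bind a group to the built-in 'edit' ClusterRole inside one namespace.

    Assumption: a plain RoleBinding replaces the RBACManager resources
    mentioned in the comment; the group name is hypothetical.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{project}-devs", "namespace": namespace},
        "subjects": [
            {
                "kind": "Group",
                "name": group,
                "apiGroup": "rbac.authorization.k8s.io",
            }
        ],
        "roleRef": {
            "kind": "ClusterRole",
            "name": "edit",
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


def render_tenant(project, namespaces):
    """Return every manifest for one tenant wrapped in a v1 List object."""
    items = []
    for ns in namespaces:
        items.append(namespace_manifest(project, ns))
        items.append(developer_rolebinding(project, ns, f"{project}-developers"))
    return {"apiVersion": "v1", "kind": "List", "items": items}


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML, so no YAML library is needed.
    print(json.dumps(render_tenant("project1", ["project1-ns1"]), indent=2))
```

The output could be piped into `kubectl apply -f -`. Because the RoleBinding is namespaced, the group gains `edit` rights only inside the tenant's own namespaces, which is the scoping the comment describes.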

But we are currently studying a refactoring of the repository to use:

  • kyverno to handle the generation of RBACManager resources (as we are already doing today for network policies)
  • kyverno to handle imagePullSecret generation and the corresponding modification of each namespace's default service account
  • argocd to apply namespace manifests

And some other ideas to test:

  • use hierarchical namespaces to give control of namespaces back to the projects: we would only create the 'parent' namespace for each project and leave them free to create all the sub-namespaces they want (all the netpols, secrets, and RBAC would be propagated from the parent namespace to its children)
  • use a homemade 'Project' CRD so that projects can describe their needs declaratively, plus a kyverno policy that detects any new 'Project' custom resource and generates all the necessary resources in reaction (we think kyverno can do the job, instead of writing a full controller). This is a bit like what Capsule proposes with its 'Tenant' CRD, but it would be more specific to our needs. By the way, Capsule might have been a nice solution for us if it had existed when we set up our onboarding process...
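The parent-to-child propagation in the hierarchical-namespace idea can be pictured with a small model. This is a toy sketch of the concept only. It does not use or reproduce the API of any hierarchical namespace controller, and the namespace and object names are hypothetical.

```python
def effective_objects(namespace, parent_of, own_objects):
    """Collect the objects visible in `namespace` under parent-to-child propagation.

    parent_of:   child namespace -> parent namespace (root namespaces are absent)
    own_objects: namespace -> set of object names defined directly in it
    """
    inherited = set()
    seen = set()
    current = namespace
    while current is not None and current not in seen:
        seen.add(current)  # guards against an accidental cycle in parent_of
        inherited |= own_objects.get(current, set())
        current = parent_of.get(current)
    return inherited


if __name__ == "__main__":
    # The platform team owns 'project1'; the project creates the sub-namespaces.
    parent_of = {
        "project1-dev": "project1",
        "project1-dev-feature-x": "project1-dev",
    }
    own_objects = {
        "project1": {"netpol/default-deny", "secret/harbor-pull"},
        "project1-dev": {"rolebinding/devs-edit"},
    }
    visible = effective_objects("project1-dev-feature-x", parent_of, own_objects)
    print(sorted(visible))
```

In this model the platform team defines policies once on the parent, and every sub-namespace a project creates picks them up by walking up the tree. The project never needs permission to edit the inherited objects themselves.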

Voilà! I hope this is clear enough and of some use 😅

yogeek avatar Feb 10 '22 05:02 yogeek

I am interested in this topic and can share the insights that we have gained through our customers. We have been focusing on supporting the multi-instance multi-tenancy use-case (soft multi-tenancy). Many of our customers want to deliver k8s-native applications to their end users in a service form. Multi-instance multi-tenancy provides a natural approach towards this. We have built the KubePlus Operator, which enables such a service-based delivery of Kubernetes applications. I will be happy to share the challenges that arise in this form of multi-tenancy and how we are addressing them in KubePlus.

devdattakulkarni avatar Feb 10 '22 11:02 devdattakulkarni

Thanks @yogeek and @devdattakulkarni. We are currently reviewing https://github.com/cncf/tag-app-delivery/pull/197, which will enable you to post this into an issue; that will greatly help us classify some of the setups out there in the wild.

I would also invite you to attend the cooperative delivery wg and/or contribute to the whitepaper from that group.

AlexsJones avatar Feb 10 '22 16:02 AlexsJones

Thanks, @adrianludwin, this feels like something we could apply a combined effort on. Whilst we certainly target Kubernetes as the substrate, the App Delivery TAG angle is going to be heavily predicated on how this pattern is used E2E and its enablement for the delivery of tenant workloads (so the scope might vary to some degree and be slightly broader).

That said, I would be keen to know what work has been started if any and how we can pool our resources 🚀

I do notice a few projects mentioned in the related issue and I would love to provide some additional examples to make sure we have a fair and representative view of a) the landscape of OSS and CNCF projects b) the commercial vendors and their approaches + challenges.

We've started our doc outline here: https://docs.google.com/document/d/192aPEDsoJ-DWsy1GYvmQt_7tKP5wXh9MN9totE81Dx4/edit#. Feel free to drop in and leave comments. On the K8s Slack, we're at #wg-multitenancy.

adrianludwin avatar Feb 10 '22 17:02 adrianludwin

A structured form for gathering use cases and implementation details is now here: https://github.com/cncf/tag-app-delivery/issues/new/choose (select "multitenancy use case").

:eyes: @yogeek @devdattakulkarni @mkorbi @randomvariable

@AlexsJones should we add a link in your OP too?

joshgav avatar Feb 10 '22 20:02 joshgav

@adrianludwin

doc outline here: https://docs.google.com/document/d/192aPEDsoJ-DWsy1GYvmQt_7tKP5wXh9MN9totE81Dx4

Can you please make that public? Thanks!

joshgav avatar Feb 10 '22 20:02 joshgav

I'm not actually the doc owner but you should be able to ask for access from Jim.

adrianludwin avatar Feb 10 '22 21:02 adrianludwin

I would also invite you to attend the cooperative delivery wg and/or contribute to the whitepaper from that group.

@AlexsJones Will be happy to contribute to the whitepaper. Is there a working document available?

devdattakulkarni avatar Feb 11 '22 22:02 devdattakulkarni

I would also invite you to attend the cooperative delivery wg and/or contribute to the whitepaper from that group.

@AlexsJones Will be happy to contribute to the whitepaper. Is there a working document available?

There isn't one at the moment. We will send out a call for contributors in the very near future, and then have a walkthrough of all the practicals in the next Cooperative Delivery WG meeting.

roberthstrand avatar Feb 12 '22 09:02 roberthstrand

Hi folks, we discussed how to move forward with this work in today's WG Coop Delivery meeting. In short, we will meet on May 25 @ 11am US Eastern and hammer out this outline based on comments here and our own insights.

Initial doc started here, everyone is a Commenter, LMK if you want to be an Editor: https://docs.google.com/document/d/1RbXoJ7WBTa_TrGxUgUEQt5BdftwUBoIPeBwilmAlFLg/

Following are the notes from today's WG conversation:

  • Compare WIP from K8s SIG Multi-tenancy
  • How would we add value to that work?
    • other infrastructure besides Kubernetes
    • more info on data plane isolation
    • whitepaper will be opinionated, doc is not
  • What about multitenancy for other infrastructure services?
    • E.g. pipeline runners, database management systems, identity
    • Running a multitenant service - a) cloud providers, b) enterprises
    • How do operators manage many tenants? (Could be part of whitepaper on operators.)
    • Secrets management and integration of Hashicorp Vault
  • Draft a framework/outline with section headings first, enables contribution
  • Next steps
    • Add these notes to GH issue
    • Develop an outline in dedicated meeting on 5/25

joshgav avatar May 11 '22 16:05 joshgav

Hi folks - Here's an initial draft of a blog post about multitenancy using virtual clusters, to be published by the TAG/CNCF: https://gist.github.com/joshgav/d3cb80c978a93f684d5b1b31ad277bc8. We decided in yesterday's meeting to publish something like this to build credibility for our TAG/WG and attract people to the effort proposed in this issue. Would love your feedback in comments there or here, thanks!

@richburroughs please LMK how to improve the description of vcluster :laughing:.

joshgav avatar May 12 '22 16:05 joshgav

Hi @joshgav :) I left a comment on the Gist.

richburroughs avatar May 12 '22 17:05 richburroughs

FYI, per CNCF support staff we won't be able to publish the blog post on cncf.io till after Kubecon EU and after the proposed meeting on May 25 :(. Let's revisit with this in mind in our proposed sync on 5/25.

In the meantime I published the current draft on my personal blog here: https://joshgav.github.io/2022/05/16/cluster-level-multitenancy.html

joshgav avatar May 16 '22 19:05 joshgav

I captured the following screenshot at https://gitopsconeu22.sched.com/event/zrqf/creating-a-landlord-for-multi-tenant-k8s-using-flux-gatekeeper-helm-and-friends-michael-irwin-docker

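[image: 2022-05-17 13 40 25] https://user-images.githubusercontent.com/126671/169707817-2e16ff45-330d-4659-ab49-f822919b345c.jpg

I found this image to be a nice overview of the various levels of multitenancy. It shows that one could write one article about each level.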

I wrote Michael and asked for his slides and if we could reuse his image. I guess the recording will be available at a later stage.

nagyv avatar May 22 '22 17:05 nagyv

Oh interesting. I wonder if he was talking about the cluster-api-nested project because I don't think that image really fits vcluster. They just run in a namespace on the host cluster and they're very cheap, it's just a couple of small pods. I don't think it's the most secure either. It's more like control plane federation for the tenants, probably somewhere in the middle there.

richburroughs avatar May 22 '22 20:05 richburroughs

Oh interesting. I wonder if he was talking about the cluster-api-nested project because I don't think that image really fits vcluster. They just run in a namespace on the host cluster and they're very cheap, it's just a couple of small pods. I don't think it's the most secure either. It's more like control plane federation for the tenants, probably somewhere in the middle there. Rich

Michael talked about a custom multi-tenancy setup he architected with his team at Virginia Tech. Here's the recording of his talk. He specifically mentions that the cluster-per-tenant model didn't exactly fit their needs, and somewhere in the back of my mind I recall him dropping the term vcluster, but I'm not sure anymore.

makkes avatar May 23 '22 08:05 makkes

Insightful paper on the architecture of CAPN and perhaps vcluster: https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/main/virtualcluster/doc/vc-icdcs.pdf

Hat tip to @fei-guo

joshgav avatar May 25 '22 12:05 joshgav

Here is the PR for Kubernetes multi-tenancy, thanks to JimBugwadia: https://github.com/kubernetes/website/pull/33934

devdattakulkarni avatar May 25 '22 13:05 devdattakulkarni

If the CNCF approves an exception, we can publish a version of https://joshgav.github.io/2022/05/16/cluster-level-multitenancy.html on https://kubernetes.io/blog/

Normally, we only publish blog article content that has not been published elsewhere.

sftim avatar May 25 '22 15:05 sftim

Apologies folks, I meant to attend the meeting today but I'm super jet lagged from KubeCon.

Let me know if there's anything else you need from me around vcluster :) I'm out Thursday-Monday but I'll be back on 5/31.

richburroughs avatar May 25 '22 17:05 richburroughs

If the CNCF approves an exception, we can publish a version of https://joshgav.github.io/2022/05/16/cluster-level-multitenancy.html on https://kubernetes.io/blog/

Thanks for the offer @sftim! In the last sync on this topic @AlexsJones and team decided we should focus on multitenancy for services other than K8s clusters. With that in mind, I'll defer to the team on whether we want to share my post more broadly.

joshgav avatar Jun 01 '22 15:06 joshgav