vcluster Operator support

Is your feature request related to a problem?

We want to create new vclusters quickly without using the official CLI tool. We want declarative configuration that uses Kubernetes itself as the centralized version controller, rather than keeping our values.yml externally and relying on an external agent (such as CI/CD automation) to "reconcile" it.

Which solution do you suggest?

Make an official operator

Which alternative solutions exist?

Someone already made something similar: https://github.com/zachomedia/vcluster-operator but its last update was in 2021, so the project is clearly stale by now.

Additional context

We want to use CRDs instead of Helm or the vcluster CLI because we are building a tool that creates geo-based clusters, runs Ceph on them, and recursively runs vcluster inside them, again for KubeVirt. This way we can run a global Kubernetes instance (although this is very much not recommended) to share load balancing, cluster management and monitoring under one network with strong BGP guarantees (we use Calico as the CNI backbone). We can further subdivide the vcluster-in-vcluster setup to create "availability groups", and then subdivide by geolocation again for datacenters that are close to each other (which is yet another level of recursion; I know it sounds confusing). For example, we have the following hierarchy:

World -> Asia -> East Asia -> Japan -> TokyoDC1/TokyoDC2
World -> Asia -> East Asia -> Japan -> OsakaDC1/OsakaDC2
World -> Asia -> East Asia -> Hong Kong -> ShatinDC1/ShatinDC2
World -> Asia -> East Asia -> Hong Kong -> KwaiTungDC1/KwaiTungDC2
World -> Asia -> Southeast Asia -> Singapore -> SingtelDC1/SingtelDC2
World -> America -> North America -> Canada -> MontrealDC1/MontrealDC2
World -> America -> North America -> United States -> CaliforniaDC1/CaliforniaDC2

The leaves are the actual Kubernetes nodes, connected by WireGuard.

So we will first create a world Kubernetes cluster that combines all of the nodes we control, regardless of location, all under one subnet.

Then we create two vclusters, one for Asia and one for America, each using its own node selector. Let's call them continental clusters.

For each continental cluster, we create further vclusters based on the hierarchy. For example, once the WorldAsia cluster is created, we spawn two vclusters for East Asia and Southeast Asia and use node selectors again to subdivide further. Repeat this down to the leaves; you know what I mean.
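The recursive creation order described above can be sketched as a pre-order walk over the hierarchy, so that every host cluster is planned before the vclusters inside it. This is only an illustration: the `geo/level<N>` node label keys and the `plan` helper are hypothetical, and the naming scheme is assumed from the "WorldAsia" example in this post.

```python
from typing import Dict, Iterator

# Subset of the hierarchy from this post. Leaves (lists) are datacenters,
# i.e. real nodes, so no vcluster is planned for them in this sketch.
HIERARCHY = {
    "Asia": {
        "East Asia": {
            "Japan": ["TokyoDC1", "TokyoDC2", "OsakaDC1", "OsakaDC2"],
            "Hong Kong": ["ShatinDC1", "ShatinDC2", "KwaiTungDC1", "KwaiTungDC2"],
        },
        "Southeast Asia": {"Singapore": ["SingtelDC1", "SingtelDC2"]},
    },
    "America": {
        "North America": {
            "Canada": ["MontrealDC1", "MontrealDC2"],
            "United States": ["CaliforniaDC1", "CaliforniaDC2"],
        },
    },
}


def plan(tree: Dict, host: str = "World", depth: int = 1) -> Iterator[Dict]:
    """Yield one vcluster per non-leaf level, parents before children."""
    for name, children in tree.items():
        cluster = f"{host}{name}".replace(" ", "")
        yield {
            "name": cluster,
            "host": host,  # cluster this vcluster would be created in
            # Hypothetical label key; real nodes would need matching labels.
            "nodeSelector": {f"geo/level{depth}": name},
        }
        if isinstance(children, dict):
            yield from plan(children, cluster, depth + 1)


steps = list(plan(HIERARCHY))
for step in steps:
    print(f'{step["host"]} -> {step["name"]} {step["nodeSelector"]}')
```

Each planned entry would map to one custom resource applied inside its host cluster, which is where an operator with a vcluster CRD would come in.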

From the penultimate level we start to deploy Level 1 Ceph (because the locations are close enough for nearline replication), and the ultimate level (the leaves) will also have Level 2 Ceph, where each Level 2 Ceph operates on its own (since it is already at the DC level).

I know this sounds very much like Loft, but it is more traditional since it can also run geo-replicated, fault-tolerant VMs and multi-vcluster Kubernetes to run services and development environments. So it's more like Loft, KubeVirt, Proxmox, Ceph and Calico had an orgy together and produced a monster.

The benefit of this is that we can finally have a stable BGP network without hassle, and cross-regional RPC calls become easy as pie: just recursively expose the services and we are good to go. No need for a fancy API gateway and shit. It also helps simplify Ceph deployment at the cost of more CPU and memory, but that is much better than having VMs. We still have the option to run VMs though, because we included KubeVirt to run legacy Windows services as well.

However, the missing puzzle piece here is of course vcluster. vcluster currently does not provide any means of deploying via CRDs (while the rest of our current components do), and the closest thing we can get is Helm with server-side apply, or producing the needed manifests directly and applying those server-side as well. This, however, is not going to work well in our recursive cluster deployment model. One of the more important things about using an operator is state reconciliation, which keeps the resource healthy and reproducible. Helm alone supports deployment but not reconciliation.
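To make the deployment-versus-reconciliation distinction concrete, here is a minimal, generic sketch of a reconcile step. It is not vcluster or Helm code: `reconcile` and `apply_actions` are hypothetical helpers that compare a desired state (what the CRs declare) with an observed state (what actually exists) and compute the actions needed to converge. An operator would run this repeatedly, whereas a one-shot deploy runs it once and never notices later drift.

```python
from typing import Dict, List, Tuple

State = Dict[str, dict]
Action = Tuple[str, str]  # (verb, resource name)


def reconcile(desired: State, observed: State) -> List[Action]:
    """Compute the actions that move the observed state to the desired one."""
    actions: List[Action] = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_actions(actions: List[Action], desired: State, observed: State) -> None:
    """Pretend to execute the actions by mutating the observed state."""
    for verb, name in actions:
        if verb == "delete":
            observed.pop(name)
        else:
            observed[name] = dict(desired[name])


# Illustrative state only: one drifted vcluster and one orphaned one.
desired = {"world-asia": {"replicas": 1}}
observed = {"world-asia": {"replicas": 3}, "old-cluster": {}}

first_pass = reconcile(desired, observed)
apply_actions(first_pass, desired, observed)
second_pass = reconcile(desired, observed)  # nothing left to do once converged
print(first_pass, second_pass)
```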

Dec 08 '22 09:12 wizpresso-steve-cy-fan

I don't think there are plans to create a vcluster operator right now. We already have the Cluster API provider vcluster (Docs) for declarative vcluster instance management, but it is limited to one host cluster. Our commercial product Loft can manage vclusters via CRDs as well, and it can do that from a central API and across many host clusters, so it might actually be a good fit for your use case. Loft does not address multi-cluster deployment use cases, but it handles vcluster management and user access/management pretty well. Don't hesitate to book a demo or ask us questions (e.g. in Slack) to find out more.

I have to say, this is probably the most interesting and ambitious vcluster use case I've seen! Definitely let us know if you write a blog post about it or give a talk, or if you just want to share progress on how the setup is going (btw, do you already have a PoC?).

Dec 08 '22 09:12 matskiv

@matskiv I use vcluster because I don't need Loft at all. I have my own identity provider for that. vcluster is minimal enough for me. Also, Loft is a commercial product, while we want to keep things as open source as possible. No go for Loft, sorry, but it is very close.

Dec 08 '22 09:12 wizpresso-steve-cy-fan

@wizpresso-steve-cy-fan would you be able to use a helm-operator as a workaround, I wonder? Meaning using the vcluster helm chart and basically using the generic HelmRelease CR with the relevant values instead of an explicit Cluster CR?
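A rough sketch of what that workaround could look like, assuming Flux's helm-controller is the Helm operator in use. The function below builds a `HelmRelease` manifest for the vcluster chart as a Python dict and prints it as JSON, which `kubectl apply -f -` accepts. The `vcluster_helm_release` helper, the `loft` HelmRepository name, the `flux-system` namespace and the interval are assumptions for illustration; the `apiVersion` depends on the installed Flux version (older releases use `v2beta1` or `v2beta2`).

```python
import json
from typing import Optional


def vcluster_helm_release(name: str, namespace: str,
                          values: Optional[dict] = None) -> dict:
    """Build a Flux HelmRelease that installs one vcluster from its Helm chart."""
    return {
        # Assumption: Flux with the GA HelmRelease API; adjust for older versions.
        "apiVersion": "helm.toolkit.fluxcd.io/v2",
        "kind": "HelmRelease",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "interval": "10m",
            "chart": {
                "spec": {
                    "chart": "vcluster",
                    "sourceRef": {
                        "kind": "HelmRepository",
                        # Hypothetical HelmRepository object, assumed to point
                        # at https://charts.loft.sh and to already exist.
                        "name": "loft",
                        "namespace": "flux-system",
                    },
                }
            },
            # Passed through to the chart unchanged, like a values.yml.
            "values": values or {},
        },
    }


release = vcluster_helm_release("world-asia", "vclusters")
print(json.dumps(release, indent=2))
```

One such object per vcluster would then stand in for the explicit Cluster CR, with the chart values taking the place of a dedicated spec.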

Mar 25 '23 11:03 redradrat

@redradrat I tried, but it is not perfect. In particular, if the helm upgrade hits errors it fails to reconcile and then spams upgrade and rollback again and again... so it is a no go.

Dec 14 '23 03:12 wizpresso-steve-cy-fan

Just chiming in to say I'd also be interested in this. I had actually assumed that an operator was the way it worked, and was surprised to find that the helm chart sets up an individual cluster and not an operator. My use case is simply wanting to easily deploy multiple vclusters on my one host cluster using Flux CD. The Helm method works for that; it just feels clunky. I'd much rather deploy an operator and then just need to have a CR representing each vcluster.

May 08 '24 20:05 virtualdxs