
[Feature] Make it possible to target a Cluster with Multiple Virtual Nodes

Open aleoli opened this issue 1 year ago • 14 comments

Describe the solution you'd like

It should be possible to have multiple virtual nodes targeting the same remote cluster. It could be helpful for many reasons, for instance:

  1. make the other cluster aware of each remote node, enabling better scheduling and affinity enforcement
  2. share huge resource pools with another cluster while keeping the virtual node size quite small, avoiding a "black hole" effect during scheduling

The current ResourceOffer CRD spec should gain two new, mutually exclusive fields that let the remote cluster handle node names.

```yaml
apiVersion: offloading.liqo.io/v1alpha1
kind: ResourceOffer
metadata:
  namespace: tenant-namespace
spec:
  nodeName: ... # the exact node name to be used in the remote cluster; useful for exact replication of local nodes
  nodeNamePrefix: ... # a prefix used by the remote cluster to generate node names; useful for resource pools
```
The creation of multiple resource offers will trigger the creation of multiple VirtualNode resources and, in turn, multiple additional nodes.
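For illustration, the two fields could be used as follows. The resource names and values below are hypothetical; only the field semantics follow the spec sketched above:

```yaml
# Exact replication of a local node: the remote cluster creates a node
# named exactly "worker-03". Names here are illustrative.
apiVersion: offloading.liqo.io/v1alpha1
kind: ResourceOffer
metadata:
  name: offer-worker-03
  namespace: tenant-namespace
spec:
  nodeName: worker-03
---
# Resource pool: the remote cluster generates node names from the prefix,
# e.g. "pool-gpu-0", "pool-gpu-1", ...
apiVersion: offloading.liqo.io/v1alpha1
kind: ResourceOffer
metadata:
  name: offer-gpu-pool
  namespace: tenant-namespace
spec:
  nodeNamePrefix: pool-gpu
```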

_(attached diagram: new_workflow.drawio)_

Required Steps

  • [x] add fields to the CRD
  • [x] add a new configuration to the ResourceRequest operator to enable the creation of a ResourceOffer for each node
  • [x] modify the resource offer operator to create the new VirtualNode (see #1766) CRD instead of VK Deployments
  • [ ] check and fix the plugin interface to work with multiple offers for the same request

Future Steps

  • the resource offer operator may choose to split a single offer to multiple VirtualNodes to avoid the "black hole" effect
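The splitting step above could be sketched as follows. This is a minimal illustration, not Liqo code: the types and function are hypothetical, and a real implementation would operate on the ResourceOffer/VirtualNode CRs rather than plain structs.

```go
package main

import "fmt"

// Resources is a hypothetical, simplified stand-in for an offer's capacity.
type Resources struct {
	CPUMilli int64 // CPU in millicores
	MemoryMi int64 // memory in MiB
}

// splitOffer divides one aggregated offer into n equal slices, so the
// scheduler sees n medium-sized virtual nodes instead of one huge node
// (the "black hole" effect described above).
func splitOffer(total Resources, n int) []Resources {
	slices := make([]Resources, n)
	for i := 0; i < n; i++ {
		slices[i] = Resources{
			CPUMilli: total.CPUMilli / int64(n),
			MemoryMi: total.MemoryMi / int64(n),
		}
	}
	// Assign any integer-division remainder to the first slice
	// so no capacity is lost.
	slices[0].CPUMilli += total.CPUMilli % int64(n)
	slices[0].MemoryMi += total.MemoryMi % int64(n)
	return slices
}

func main() {
	// Split a 64-core / 256 GiB offer into 4 virtual nodes.
	parts := splitOffer(Resources{CPUMilli: 64000, MemoryMi: 262144}, 4)
	fmt.Println(len(parts), parts[0].CPUMilli, parts[0].MemoryMi)
}
```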

aleoli avatar Apr 06 '23 09:04 aleoli

@aleoli One clarification: only one VK deployment will be created when multiple VirtualNode CRs are created, right? Are all VirtualNodes of one remote cluster handled by a single VK deployment?

Thank you for this amazing feature.

Sharathmk99 avatar Apr 06 '23 17:04 Sharathmk99

One of our remote clusters has 300+ nodes. If we start creating a VK deployment for every VirtualNode CR of the same remote cluster, there will be 300 pods in the host cluster for that one remote cluster; if we peer 10 remote clusters, there will be 3000 pods. The operational effort is too high, and we would also need to tune the API server and etcd. It would be useful to have one VK deployment for all VirtualNode CRs of the same remote cluster.

Sharathmk99 avatar Apr 07 '23 06:04 Sharathmk99

Let's go by steps. You have a point, but the VirtualKubelet assumes it is handling only one node, and changing this would require a significant refactoring of its code. Let's start by allowing multiple nodes for every cluster, and then think about consolidating all these "similar" deployments into a single one. (This is my personal opinion, open to discussion.)

And second, 300 nodes x 10 clusters will stress the API server in any case. On the other hand, do you need to see all these nodes in the central cluster? Could some aggregation policy, for instance one virtual node for each node pool, be helpful?

aleoli avatar Apr 07 '23 07:04 aleoli

I would suggest to go by steps:

  • refactor the internals of Liqo in order to support a more sophisticated mapping between remote clusters and the number of VKs in the "home" cluster
  • provide some "default" behavior (e.g., (a) the entire cluster to a single VK, such as now, (b) each node in the remote cluster to a different VK, which of course may have scalability problems, but looks useful in some use cases)
  • define a way to provide a more granular mapping

This is intended to minimize disruptions while bringing this feature to life asap (incorporating incoming comments in the design).

frisso avatar Apr 08 '23 14:04 frisso

Thank you @aleoli and @frisso. I totally agree with you by going by steps. Just wanted to highlight some scalability problems.

I agree with @aleoli that creating one VK per node pool would be helpful as well. Will I get control over how VirtualNode CRs are created? I see from the diagram that VirtualNode CRs are created from the ResourceOffer CR; will I get control over how a ResourceOffer gets created and which nodes are included in it?

Sorry for my dummy questions. Thank you!!

Sharathmk99 avatar Apr 08 '23 15:04 Sharathmk99

In the first implementation, some flags should be available to select pre-defined behaviors (i.e., one virtual node for the whole cluster, one virtual node for each physical node, etc.).

Note that a resource plugin interface can be implemented to have the preferred custom behavior. We should check and eventually fix it to work with multiple offers for the same request. I'm adding this step to the main issue steps.

In the future, we should define other plugins in the library to handle the most exciting scenarios and add them to the repository. This way, we should avoid adding many flags to the primary resource offer controller.

aleoli avatar Apr 11 '23 10:04 aleoli

I agree with you. Looking forward to try it out. Thank you!

Sharathmk99 avatar Apr 11 '23 11:04 Sharathmk99

This is a much-needed feature for us, as we have multiple node types (CPU, GPU, and memory-intensive). Please let us know how we can help implement it. This feature would also help distribute pods across multiple virtual nodes, since currently all pods get scheduled on one virtual node. One virtual node is currently handling around 400 pods, and we don't want to run into scaling issues. Thank you.

Sharathmk99 avatar Oct 01 '23 21:10 Sharathmk99

Any update on this? Can we help implement it, or is there a timeline? Thank you.

Sharathmk99 avatar Oct 22 '23 17:10 Sharathmk99

Hi @Sharathmk99, at the moment you can create additional nodes after peering. You need to use the command liqoctl create virtualnode and specify the kubeconfig secret contained inside the liqo-tenant namespace with the flag --kubeconfig-secret-name.

At the moment it is not possible to target a subset of remote nodes with a virtual node. It is a feature that we are going to implement after the release of the new network.

cheina97 avatar Oct 23 '23 10:10 cheina97

Just to give you some more info: we're deeply improving the internals of Liqo, with a strong focus on modularity. It is hard to provide full support for the feature you're asking for with the current architecture, so we implemented part of it, but with no documentation (yet). Our plan is to restart our efforts in that direction as soon as this phase has been completed, which means about 2 months (approximately Jan 2024), when the rest of the code will allow an easier and cleaner implementation. Maybe this should be reflected in the public roadmap, shouldn't it? (@cheina97 @aleoli)

frisso avatar Oct 23 '23 12:10 frisso

@aleoli @cheina97 is it possible to prioritize this feature, or can we help develop it?

Our cluster is growing, and we need to create multiple virtual nodes targeting the same foreign cluster.

Sharathmk99 avatar Mar 04 '24 23:03 Sharathmk99

Hi @Sharathmk99! We will include full support for this feature in Liqo 0.12; we need to refactor the authentication module to support it fully (see #2382).

However, you can already add additional virtual nodes to your peering with Liqo 0.10, either by creating additional VirtualNode CRs or by using the liqoctl create virtualnode command.
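As a rough sketch, an additional VirtualNode CR toward an already-peered cluster might look like the following. This is illustrative only: the apiVersion, field names, and values are assumptions based on the 0.10-era API and should be verified against the VirtualNode CRD installed in your cluster (e.g. with `kubectl explain virtualnode.spec`):

```yaml
# Illustrative sketch only: verify field names against your installed CRD.
apiVersion: virtualkubelet.liqo.io/v1alpha1
kind: VirtualNode
metadata:
  name: remote-cluster-gpu          # hypothetical name for a second node toward the same peer
  namespace: liqo-tenant-remote     # the liqo-tenant namespace of the peering
spec:
  createNode: true
  kubeconfigSecretRef:
    name: kubeconfig-remote-cluster # the kubeconfig secret in the tenant namespace
  resourceQuota:
    hard:
      cpu: "8"
      memory: 32Gi
```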

aleoli avatar Mar 11 '24 09:03 aleoli

Hi @aleoli, Thank you for the clarification.

With v0.10.0, can I add a virtual node pointing to an existing foreign cluster? And how can I change the resource offer plugin to expose different resources for each virtual node?

Thank you!!

Sharathmk99 avatar Mar 17 '24 11:03 Sharathmk99