Allow to move part of all managed clusters to another management cluster

Open omniproc opened this issue 2 years ago • 3 comments

User Story

As a CAPI operator I would like to move part of the workload clusters managed by a single CAPI management cluster (source) to another CAPI management cluster (target) so different workload clusters can be moved to different management instances and their lifecycle can be decoupled later on.

Detailed Description

Currently clusterctl move moves all Cluster API resources from one management cluster to another. However, there are situations where you only want to migrate some of the managed resources. For example, a CAPI management cluster manages two workload clusters, A and B, and at some point in their lifecycle you decide to split them up so that they are managed by different CAPI management clusters. There are multiple reasons why you might need that: tenant or network separation, version dependencies (the CAPI or infrastructure provider versions have drifted for the managed workload clusters), and so on.

Introduce a new argument --clusters to the move command of clusterctl that takes a list of managed clusters, with the default of * (any) for backward compatibility, that should be moved from the source CAPI management cluster to the target CAPI management cluster.

Not sure whether the operation should be atomic (succeed only if every cluster in the list could be moved) or not. I guess that depends on how clusterctl move currently handles resources that fail to move partway through.

/kind feature

omniproc avatar Aug 15 '22 15:08 omniproc

Interesting use case!

/area clusterctl

killianmuldoon avatar Aug 15 '22 15:08 killianmuldoon

Just in case it helps: move has a --namespace parameter, which moves only the Clusters in a specific namespace.

The main difficulty in moving an individual cluster will be how to handle objects which are used by multiple Clusters, e.g. ClusterClass.

Not sure if we already have this problem today with cluster-wide, infra-provider-specific resources when only the Clusters of one namespace are moved (not sure if a cluster-wide, infra-provider-specific resource like this exists; maybe AWSClusterControllerIdentity?)

sbueringer avatar Aug 15 '22 16:08 sbueringer

/triage accepted

The main difficulty in moving an individual cluster will be how to handle objects which are used by multiple Clusters, e.g. ClusterClass.

True, but the problem is even more complex, because there could also be objects that are not related to any cluster (other root hierarchies) or objects that we force to move.

Not sure if we already have this problem today with cluster-wide, infra-provider-specific resources when only the Clusters of one namespace are moved (not sure if a cluster-wide, infra-provider-specific resource like this exists; maybe AWSClusterControllerIdentity?)

AFAIK we don't have any problem today because move was designed for, and is mostly used for, the bootstrap/pivot use case, where there is only one cluster and one namespace.

Every previous attempt to enhance clusterctl move to properly cover other use cases resulted in small improvements, but not in a change to the original scope of the command (see e.g. https://github.com/kubernetes-sigs/cluster-api/issues/3354). In other words, even if the internal implementation of move is pretty generic and flexible, in practice there is only one use case that is properly tested and verified.

fabriziopandini avatar Aug 26 '22 12:08 fabriziopandini

This is an interesting use case.

Handling resources that are shared across multiple clusters is definitely an interesting problem for this use case. CRS is another example where resources are shared across clusters. The same CRS resource could be bound to multiple clusters.

ykakarap avatar Aug 29 '22 22:08 ykakarap

I also need this feature. It would be good to add it to the backup command as well, to be able to back up only one workload cluster.

mel1nn avatar Nov 22 '22 16:11 mel1nn

It would be good to add it to the backup command as well, to be able to back up only one workload cluster.

Ref. https://github.com/kubernetes-sigs/cluster-api/issues/6992: -1 from my side (the reasons are discussed in that issue).

fabriziopandini avatar Nov 22 '22 19:11 fabriziopandini

This would come in handy now: I need to migrate workload clusters to a new management cluster, and I am afraid of doing them all at once.

project0 avatar Jan 05 '23 09:01 project0

@project0 does the --namespace arg for clusterctl move help you, or are all of your workload clusters in the same namespace?

killianmuldoon avatar Jan 05 '23 09:01 killianmuldoon

@killianmuldoon They are all in the same namespace, as it is easier for us to maintain with Flux in between :-(

project0 avatar Jan 05 '23 10:01 project0

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot avatar Jan 20 '24 03:01 k8s-triage-robot

FYI https://github.com/kubernetes-sigs/cluster-api/issues/9705 describes a similar use case + expresses similar concerns about the requirements for moving objects shared across clusters

/triage needs-discussion

fabriziopandini avatar Feb 06 '24 19:02 fabriziopandini

/priority backlog

fabriziopandini avatar Apr 12 '24 13:04 fabriziopandini

This issue is currently awaiting triage.

CAPI contributors will take a look as soon as possible, apply one of the triage/* labels and provide further guidance.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Apr 18 '24 13:04 k8s-ci-robot

The Cluster API project currently lacks enough active contributors to adequately respond to all issues and PRs.

Also, the issue hasn't had any updates since Jan 2023, and we never reached an agreement on how to handle objects shared between many clusters.

/close

fabriziopandini avatar Apr 30 '24 12:04 fabriziopandini

@fabriziopandini: Closing this issue.

k8s-ci-robot avatar Apr 30 '24 12:04 k8s-ci-robot