
[GEP-28] Support managing machines without target cluster

Open timebertt opened this issue 6 months ago • 2 comments

How to categorize this issue?

/area ipcei /kind enhancement

What would you like to be added:

If machine-controller-manager or the machine-controller-manager-provider is started with --target-kubeconfig=none, it should disable all interactions with the target cluster.

This should allow the creation of machines in the infrastructure without an existing target cluster. Notable differences from the standard machine creation flow are:

  • mcm doesn't create bootstrap token secrets
  • mcm doesn't set the node label on the Machine object
  • mcm doesn't wait for the Node object to exist in the target cluster
  • mcm sets the Machine object to Available after creating it in the infrastructure (target state)
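
The differences listed above could be captured in a small decision table. The following Go snippet is only a sketch of that idea: `creationPlan` and `planCreation` are hypothetical names made up for illustration and are not part of machine-controller-manager's actual code.

```go
package main

// creationPlan records which target-cluster steps of machine creation run.
// Hypothetical type for illustration; not taken from the mcm code base.
type creationPlan struct {
	// CreateBootstrapToken: create the bootstrap token secret.
	CreateBootstrapToken bool
	// SetNodeLabel: set the node label on the Machine object.
	SetNodeLabel bool
	// WaitForNode: wait for the Node object to exist in the target cluster.
	WaitForNode bool
	// AvailableIsTargetState: treat Available as the Machine's target state
	// once it has been created in the infrastructure.
	AvailableIsTargetState bool
}

// planCreation returns the plan for the given mode. With targetDisabled set
// (the proposed --target-kubeconfig=none), every target-cluster step is skipped.
func planCreation(targetDisabled bool) creationPlan {
	return creationPlan{
		CreateBootstrapToken:   !targetDisabled,
		SetNodeLabel:           !targetDisabled,
		WaitForNode:            !targetDisabled,
		AvailableIsTargetState: targetDisabled,
	}
}
```

Calling `planCreation(true)` yields a plan in which only `AvailableIsTargetState` is set, which mirrors the four bullets above.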

When deleting machines without a target cluster:

  • mcm doesn't drain or delete the Node in the target cluster
  • mcm doesn't delete or wait for VolumeAttachments in the target cluster
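
The deletion case can be sketched the same way. Again, `deletionPlan` and `planDeletion` are hypothetical names chosen for illustration, assuming a single boolean that tells whether the target cluster is disabled.

```go
package main

// deletionPlan records which target-cluster steps of machine deletion run.
// Hypothetical type for illustration; not taken from the mcm code base.
type deletionPlan struct {
	// DrainNode: drain the Node in the target cluster.
	DrainNode bool
	// DeleteNode: delete the Node object in the target cluster.
	DeleteNode bool
	// HandleVolumeAttachments: delete or wait for VolumeAttachments.
	HandleVolumeAttachments bool
}

// planDeletion returns the plan for the given mode. Without a target cluster,
// only the infrastructure-side deletion remains, so all fields are false.
func planDeletion(targetDisabled bool) deletionPlan {
	return deletionPlan{
		DrainNode:               !targetDisabled,
		DeleteNode:              !targetDisabled,
		HandleVolumeAttachments: !targetDisabled,
	}
}
```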

I propose using --target-kubeconfig=none rather than --target-kubeconfig="" (or omitting the flag), because an empty or omitted flag currently implies using the in-cluster config. Repurposing --target-kubeconfig="" would therefore be a breaking change.
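
A minimal sketch of how the three flag values might be told apart, assuming the semantics described above. `targetMode` and `resolveTargetMode` are hypothetical names for illustration, not existing mcm identifiers.

```go
package main

// targetMode describes how the controller would reach the target cluster.
// Hypothetical type for illustration.
type targetMode int

const (
	// targetInCluster: empty or omitted flag, use the in-cluster config
	// (the current behavior that must not change).
	targetInCluster targetMode = iota
	// targetDisabled: the proposed "none" value, no target cluster interactions.
	targetDisabled
	// targetKubeconfigFile: any other value is treated as a kubeconfig path.
	targetKubeconfigFile
)

// resolveTargetMode maps the raw --target-kubeconfig value to a mode.
func resolveTargetMode(flagValue string) targetMode {
	switch flagValue {
	case "":
		return targetInCluster
	case "none":
		return targetDisabled
	default:
		return targetKubeconfigFile
	}
}
```

One consequence of this design is that a kubeconfig file literally named `none` in the working directory could no longer be passed by that bare name; a path such as `./none` would still resolve to the file mode in this sketch.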

Why is this needed:

GEP-28 (https://github.com/gardener/gardener/issues/2906) includes the medium-touch scenario, where the control plane machines of the autonomous shoot cluster are managed using the provider extension and machine-controller-manager (gardenadm bootstrap). The main idea is to reuse existing components for managing the infrastructure resources (network resources and machines), and then gardenadm init will bootstrap the cluster's control plane when executed on the prepared machines.

In contrast to normal shoot clusters (with a control plane hosted on a seed cluster), there is no control plane running when creating machines (with gardenadm bootstrap) that will be used as control plane nodes. To support this scenario, machine-controller-manager should allow disabling all interactions with the target cluster.

Successfully running gardenadm bootstrap might not require support for deleting machines without a target cluster. However, automatic deletion of machines will be helpful in error cases (e.g., when machines are misconfigured) and during development.

timebertt, May 16 '25 12:05

/assign

timebertt, May 16 '25 12:05

Starting to implement this :)

timebertt, Jun 10 '25 14:06