
Resources representing MachinePool Machines

Open devigned opened this issue 3 years ago • 17 comments

User Story

As a user I would like to be able to see individual MachinePool machine resources and be able to manipulate them.

Detailed Description

For example, as a user I would like to be able to delete a specific instance, machine, in a machine pool. Currently, the only way to remove a machine from a machine pool is to decrease the replica count and hope the machine is deleted.

For example, as a user I would like to be able to see the status of a specific machine in a machine pool: what is the running state of the machine, and what conditions are set on it?

This sounds like a relatively simple request, but quickly explodes into a larger MachinePool design conversation.

  • Can all MachinePool providers be expected to delete a machine individually?
    • If an instance can be deleted, what guarantees that there are still enough machines in each region or zone to achieve high availability? Who is responsible for zonal balancing, the infrastructure service (auto scale groups, virtual machine scale sets, etc.) or CAPI?
    • Do we need to know about pod disruption budgets?
    • Does each provider need to know how to cordon and drain machine pool nodes, or is there a common facility in CAPI?
  • Does a Machine need to be more flexible about how it references the infrastructure provider? Should a Machine instead reference a corev1.Node and use that as the level of abstraction?
  • Does this have an impact on machine health checks?

/kind feature

devigned avatar Jan 11 '21 16:01 devigned

From the conversation we had on Friday, there were a few ideas on the table. In this post I'm going to lay out the proposal I made during the meeting: reuse the current Machine object to represent a MachinePool's Machine.

Machine controller

The Machine controller today relies on an infrastructureRef to understand when it's in the Ready state and which ProviderID it should work against. A Machine is intended to become a Kubernetes Node.
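
For reference, here is an abridged sketch of the fields involved; the names follow the cluster-api Machine types as I read them (several fields are omitted), so treat it as an illustration rather than the authoritative API:

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// Abridged view of the Machine API surface the controller relies on.
type MachineSpec struct {
	// Reference to the provider-specific infrastructure object; the Machine
	// only becomes Ready once this object reports Ready.
	InfrastructureRef corev1.ObjectReference `json:"infrastructureRef"`
	// ProviderID must match the value the cloud provider sets on the Node.
	ProviderID *string `json:"providerID,omitempty"`
}

type MachineStatus struct {
	// Set once the Machine has been matched to a Kubernetes Node; from then
	// on the controller also operates on the Node itself.
	NodeRef *corev1.ObjectReference `json:"nodeRef,omitempty"`
}
```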

The controller, after a status.NodeRef has been assigned, operates on the Node itself, potentially adding labels to it and reconciling conditions coming from the node.

Upon deletion, the Machine controller deletes all linked resources first (infrastructureRef, bootstrapRef, etc.), then proceeds to cordon and drain the node before removing its finalizer. We also support pre-drain and pre-delete hooks, where folks can add annotations to delay the deletion of a Machine.
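
To make the hook mechanism concrete, here is a small sketch. The hook names and owner value are invented, and the annotation key prefixes and import path are the ones I recall from the deletion phase hooks convention and recent v1beta1 releases, so double-check them before relying on this:

```go
package main

import (
	"fmt"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// addDeletionHooks annotates a Machine so that the Machine controller pauses
// its deletion flow until the owning controllers remove the annotations again.
func addDeletionHooks(m *clusterv1.Machine) {
	if m.Annotations == nil {
		m.Annotations = map[string]string{}
	}
	// Blocks the controller just before it drains the node.
	m.Annotations["pre-drain.delete.hook.machine.cluster.x-k8s.io/migrate-workload"] = "example-owner"
	// Blocks the controller just before it deletes the infrastructure.
	m.Annotations["pre-terminate.delete.hook.machine.cluster.x-k8s.io/cleanup"] = "example-owner"
}

func main() {
	m := &clusterv1.Machine{}
	addDeletionHooks(m)
	fmt.Println(m.Annotations)
}
```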

Machine in a MachinePool

Using Machine objects in a MachinePool has a few benefits (a rough sketch of what such a Machine could look like follows the list):

  • Reuse all the existing code (which has been tested in production for a while) and behaviors listed above.
  • Add the ability to delete a single Machine from a MachinePool.
  • Adapt Cluster Autoscaler to support MachinePool by reusing the same code paths that today are being used with MachineDeployments.
  • Allow MachineHealthCheck to be used with MachinePool's Machines.
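
Concretely, a Machine owned by a MachinePool might be built roughly like this. The API version, label key, and naming are what I'd expect from current cluster-api conventions, but treat them as assumptions rather than the agreed design:

```go
package sketch

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// machinePoolMachine builds a Machine whose owner is a MachinePool, so that
// the existing Machine controller can manage its lifecycle.
func machinePoolMachine(clusterName, poolName, instanceName string) *clusterv1.Machine {
	return &clusterv1.Machine{
		ObjectMeta: metav1.ObjectMeta{
			Name:   poolName + "-" + instanceName,
			Labels: map[string]string{"cluster.x-k8s.io/cluster-name": clusterName},
			OwnerReferences: []metav1.OwnerReference{{
				APIVersion: "cluster.x-k8s.io/v1beta1",
				Kind:       "MachinePool",
				Name:       poolName,
				// UID omitted in this sketch; a real controller would set it.
			}},
		},
		Spec: clusterv1.MachineSpec{
			ClusterName: clusterName,
		},
	}
}
```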

On the flip side:

  • MachinePool infrastructure controllers must now understand how to delete a single Machine from a group; if they cannot, they shouldn't opt into creating Machines, which can be counterintuitive. For this point, we should seek more information on the current state of the MachinePool infrastructure implementations out there and draw some conclusions.

  • When deleting a single Machine, we won't rely on a scale-down operation from the cloud provider's scale-set resource.

    Let's explore this point a bit more (a rough decision sketch in Go follows this list):

    • If the replicas count hasn't changed
      • And a Machine is deleted
      • The MachinePool backing implementation (usually a cloud provider scale-set) should detect that there are fewer available replicas than desired, and scale back up
      • If the Machine that has been deleted caused an imbalance, we can probably safely assume that the scale-set cloud resource is going to place a newly added replica in the right place.
    • If the replicas count is being decreased
      • And a Machine is deleted
      • The MachinePool backing implementation (usually a cloud provider scale-set) shouldn't proceed with any other operations, keeping the count stable in this case.
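
A toy model of that decision, just to pin down the behavior described above (this is illustrative pseudologic, not CAPI code):

```go
package main

import "fmt"

// machinePoolScaleDecision models the cases above: desired is spec.replicas,
// observed is the number of Machines still present after a Machine deletion.
func machinePoolScaleDecision(desired, observed int32) string {
	switch {
	case observed < desired:
		// Replica count unchanged but a Machine was deleted: the backing
		// scale-set should notice the gap and scale back up (and, we assume,
		// rebalance zones on its own).
		return "scale back up"
	case observed == desired:
		// Replica count was decreased and a Machine was deleted to match it:
		// nothing else to do, keep the count stable.
		return "no-op"
	default:
		// More Machines than desired: a regular scale-down handles it.
		return "scale down"
	}
}

func main() {
	fmt.Println(machinePoolScaleDecision(3, 2)) // scale back up
	fmt.Println(machinePoolScaleDecision(2, 2)) // no-op
}
```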

Alternatives considered

Some folks pointed out that we could build a controller (which could run in the workload cluster) that watches corev1.Node objects, and move most of the logic above to it.
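
For illustration, the skeleton of such a controller with controller-runtime would look roughly like this; what it would actually do per Node is deliberately left out:

```go
package nodewatcher

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// NodeReconciler is a bare-bones sketch of the alternative: a controller that
// runs against the workload cluster and reconciles corev1.Node objects
// directly instead of Machines.
type NodeReconciler struct {
	client.Client
}

func (r *NodeReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var node corev1.Node
	if err := r.Get(ctx, req.NamespacedName, &node); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// The Machine-controller-like logic (labels, conditions, cordon/drain on
	// deletion, ...) would live here, operating on the Node itself.
	return ctrl.Result{}, nil
}

func (r *NodeReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&corev1.Node{}).
		Complete(r)
}
```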

While this is definitely something to explore, I'm a bit more comfortable using the existing and proven resources and controllers instead of building new ones from scratch. A new deployment that needs to run in the workload cluster directly might bring more complications to the overall system, in terms of version management and lockstep behaviors.

Mostly a core-dump, hopefully this helps 😄

vincepri avatar Jan 11 '21 17:01 vincepri

/area api
/priority important-longterm
/milestone Next
(hopefully we can address this in v0.4.0, although setting Next for now until we have some consensus)

vincepri avatar Jan 11 '21 17:01 vincepri

In our current clusters, we are using cluster-autoscaler with the AWS provider in combination with MachinePools. This greatly simplifies the code in cluster-api and the infra provider since all the instance management logic is effectively delegated to the cloud provider and cluster-autoscaler. When we wish to delete an instance from an autoscaling group, we simply drain the node. cluster-autoscaler detects that the node is empty, and it automatically deletes that instance roughly 10 minutes later. The use case I originally saw for MachinePools, and their distinction from MachineDeployments, is that they delegate much of this logic to the cloud provider. If we make the MachinePool logic as similar to MachineDeployments as is proposed above, I wonder why we would have MachinePools at all; we could simply use MachineDeployments.
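
As a rough illustration of the "just drain it" workflow (not our actual tooling): cordoning with client-go looks like the sketch below; the eviction half of a drain is omitted, and once the node is empty cluster-autoscaler removes the backing instance on its own.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// cordonNode marks a node unschedulable; evicting its pods (the rest of a
// drain) is intentionally left out of this sketch.
func cordonNode(ctx context.Context, cs kubernetes.Interface, name string) error {
	node, err := cs.CoreV1().Nodes().Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	node.Spec.Unschedulable = true
	_, err = cs.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}

func main() {
	fmt.Println("wire cordonNode up to a real clientset to use it")
}
```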

dthorsen avatar Jan 21 '21 16:01 dthorsen

I think @dthorsen makes a really great point and something we should not forget while discussing these changes. The main goal of MachinePools is to allow delegating functionality to the cloud providers directly (zones, scaling, etc.). If we re-implement scale-down/draining/etc. via Machines for MachinePools, we lose this completely. Perhaps a better solution is to have cloud-specific autoscaler (and k8s upgrade) implementations for MachinePools since the idea is to stay as close to the infra as possible.

CecileRobertMichon avatar Feb 01 '21 21:02 CecileRobertMichon

@CecileRobertMichon @devigned @h0tbird To this end, we have added an externallyManagedReplicaCount field to the MachinePool spec and we are currently testing this in our development environments. Infrastructure providers could use this field to determine which mode they are running in, CAPI controlled replicas or externally-controlled replicas. We will PR this soon.
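
For reference, a hedged sketch of what such a field might look like on the MachinePool spec, based purely on the description above; the name, type, and exact semantics here are illustrative, not the merged API:

```go
package sketch

// MachinePoolSpec sketch: only the fields relevant to this discussion.
type MachinePoolSpec struct {
	// Replicas is the desired machine count when CAPI manages scaling.
	Replicas *int32 `json:"replicas,omitempty"`

	// ExternallyManagedReplicaCount, when set, signals to the infrastructure
	// provider that the replica count is controlled externally (for example
	// by cluster-autoscaler acting on the cloud scale group) and that CAPI
	// should not reconcile it.
	ExternallyManagedReplicaCount *int32 `json:"externallyManagedReplicaCount,omitempty"`
}
```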

dthorsen avatar Mar 17 '21 14:03 dthorsen

@dthorsen The logic explained above makes sense, although I'd ask for a small amendment to the MachinePool proposal that describes the field in more detail.

vincepri avatar Mar 17 '21 15:03 vincepri

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

fejta-bot avatar Jun 15 '21 15:06 fejta-bot

/remove-lifecycle stale

vincepri avatar Jun 15 '21 15:06 vincepri

/assign @devigned @dthorsen

CecileRobertMichon avatar Jun 16 '21 17:06 CecileRobertMichon

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 14 '21 17:09 k8s-triage-robot

/lifecycle active

CecileRobertMichon avatar Sep 14 '21 17:09 CecileRobertMichon

/assign @mboersma

devigned avatar Sep 14 '21 17:09 devigned

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Dec 13 '21 17:12 k8s-triage-robot

/remove-lifecycle stale

mboersma avatar Dec 13 '21 18:12 mboersma

/milestone v1.2 given that there is work going on for this issue

fabriziopandini avatar Feb 11 '22 18:02 fabriziopandini

/lifecycle active

CecileRobertMichon avatar Mar 29 '22 21:03 CecileRobertMichon

/triage accepted

CecileRobertMichon avatar Aug 08 '22 16:08 CecileRobertMichon

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 06 '22 16:11 k8s-triage-robot

/remove-lifecycle stale

mboersma avatar Nov 07 '22 16:11 mboersma

/lifecycle frozen /help

fabriziopandini avatar Nov 07 '22 17:11 fabriziopandini

@fabriziopandini: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

/lifecycle frozen /help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Nov 07 '22 17:11 k8s-ci-robot

Removing it from the 1.2 milestone as it wasn't implemented in the 1.2 time frame

sbueringer avatar Nov 30 '22 17:11 sbueringer

I think we should consider follow-up tasks now that the first iteration is merged (e.g. MHC support, in-line label propagation, ...)

sbueringer avatar Jul 11 '23 14:07 sbueringer

/reopen

sbueringer avatar Jul 11 '23 14:07 sbueringer

@sbueringer: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jul 11 '23 14:07 k8s-ci-robot

@sbueringer - can we close this issue out once #8842 is merged or close it now? And add new issue(s) based on more details of what else you'd like to see enhanced/added? The issue is very old and we have delivered on the core of the ask originally IMO.

dtzar avatar Jul 17 '23 16:07 dtzar

Sure, fine for me.

It would probably make sense to create an umbrella issue with an overview of the next iteration.

sbueringer avatar Jul 17 '23 17:07 sbueringer

Ok, I will look to you for this umbrella issue. Related to #9005

dtzar avatar Jul 17 '23 20:07 dtzar

It will probably take considerable time until I find the time to create an umbrella issue for MachinePool Machines, especially to do the necessary research to figure out which parts are missing between the implementation and the proposal, and what a MachinePool Machine doesn't support at the moment compared to a "normal" (KCP/MD/MS) Machine. Limited bandwidth right now.

But feel free to close this issue, I didn't want to block. Just thought we wanted to do some follow-ups based on the discussions we had on the MachinePool Machine PR.

sbueringer avatar Jul 18 '23 14:07 sbueringer

Sounds good, thanks @sbueringer. Jonathan has it set to close automatically when #8842 merges.

dtzar avatar Jul 18 '23 19:07 dtzar