cluster-api-provider-azure icon indicating copy to clipboard operation
cluster-api-provider-azure copied to clipboard

feat: respect externally managed annotation on unmanaged MachinePools

Open mweibel opened this issue 3 years ago • 10 comments

What type of PR is this? /kind feature

What this PR does / why we need it: This adds support for marking MachinePool resources as externally managed. Similar to #2444 but for unmanaged clusters. It does not rely on a property in AzureMachinePool since, unlike in AzureManagedMachinePool`, this property does not exist.

Special notes for your reviewer: Please note that this is just a Draft PR in order to get some feedback on the approach. I reused code and methodology from #2444 in the hope that it matches the existing code style. I did not yet add tests since I just added the code and I'm trying it on a local cluster.

TODOs:

  • [ ] squashed commits
  • [ ] includes documentation
  • [ ] adds unit tests

Release note:

adds the ability to annotate a MachinePool with `cluster.x-k8s.io/replicas-managed-by-autoscaler` to synchronize VMSS capacity with MachinePool replicas automatically.

mweibel avatar Aug 22 '22 12:08 mweibel

Hi @mweibel. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 22 '22 12:08 k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please assign jackfrancis for approval by writing /assign @jackfrancis in a comment. For more information see:The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Aug 22 '22 12:08 k8s-ci-robot

https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2444 got merged so you should be able to reuse the code from it now if you rebase

See also https://github.com/kubernetes-sigs/cluster-api/pull/7107 which is relevant / in progress

CecileRobertMichon avatar Aug 23 '22 23:08 CecileRobertMichon

/ok-to-test

CecileRobertMichon avatar Aug 23 '22 23:08 CecileRobertMichon

See also https://github.com/kubernetes-sigs/cluster-api/pull/7107 which is relevant / in progress

yes I saw that, thanks. Once that PR is merged and released, we can adjust CAPZ code to use the const from CAPI.

mweibel avatar Aug 25 '22 06:08 mweibel

I tested using a separate empty commit and the tests ran through. I squashed them together and now there's again a conflict in the e2e test.

What can I do about this?

mweibel avatar Aug 25 '22 14:08 mweibel

/retest

the failure above looks like a flake

CecileRobertMichon avatar Aug 25 '22 16:08 CecileRobertMichon

/cc @jackfrancis

CecileRobertMichon avatar Aug 25 '22 17:08 CecileRobertMichon

@mweibel this is an interesting use-case. So the idea is that you might want to use cluster-autoscaler on your cluster, but not leverage the Cluster API provider, and instead use the Azure provider (which updates replicas directly via the Azure VMSS API)?

Or are there other use cases you're thinking of?

I'm wondering if there's a better, general workflow to indicate this rather than requiring the user to manually add a well-known annotation to the corresponding capi MachinePool resource?

jackfrancis avatar Sep 01 '22 17:09 jackfrancis

@mweibel: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-azure-e2e-exp 46e860c52838bd5aebc6548ebac44d4d1061b733 link false /test pull-cluster-api-provider-azure-e2e-exp

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Sep 20 '22 09:09 k8s-ci-robot

/lifecyle active

CecileRobertMichon avatar Nov 08 '22 22:11 CecileRobertMichon

@mweibel this is an interesting use-case. So the idea is that you might want to use cluster-autoscaler on your cluster, but not leverage the Cluster API provider, and instead use the Azure provider (which updates replicas directly via the Azure VMSS API)?

yes exactly. I contributed the same to CAPA already. The use case we have currently is that the cluster-api autoscaler does not yet have enough functionality to replace the native azure one. Crucial features like taints/labels and separate cool down times etc. are not available yet and will probably not for a little while (especially the taints since that's onlyin a proposal how to manage it in CAPI itself).

I'm wondering if there's a better, general workflow to indicate this rather than requiring the user to manually add a well-known annotation to the corresponding capi MachinePool resource?

I settled for an annotation since there was discussion on CAPA about this (annotations vs. a separate field) and they preferred an annotation. It also kind of fits together with the changes in kubernetes-sigs/cluster-api#7107.

mweibel avatar Nov 09 '22 11:11 mweibel

/milestone v1.7

CecileRobertMichon avatar Dec 01 '22 16:12 CecileRobertMichon

/assign

jackfrancis avatar Dec 01 '22 17:12 jackfrancis

/assign @dthorsen

jackfrancis avatar Dec 01 '22 17:12 jackfrancis

@mweibel: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-azure-e2e-exp 46e860c52838bd5aebc6548ebac44d4d1061b733 link false /test pull-cluster-api-provider-azure-e2e-exp

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Dec 02 '22 10:12 k8s-ci-robot

@mweibel this lgtm, I think we need to rebase this PR on top of latest main to overcome the E2E flake

Thanks again for this, let's try to land this before mid-next week so it can be included in the 1.7. release!

jackfrancis avatar Jan 06 '23 17:01 jackfrancis

I think we need to rebase this PR on top of latest main to overcome the E2E flake

Prow does the rebase when running tests

/retest

CecileRobertMichon avatar Jan 06 '23 18:01 CecileRobertMichon

as discussed in https://kubernetes.slack.com/archives/CEX9HENG7/p1673034488923059, let's merge this and refactor to use the CAPI helper function in a separate PR: https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/2993

/lgtm /approve

CecileRobertMichon avatar Jan 07 '23 00:01 CecileRobertMichon

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • ~~OWNERS~~ [CecileRobertMichon]

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Jan 07 '23 00:01 k8s-ci-robot