Use cloud-provider-aws instead of legacy-cloud-providers/aws
Which component this PR applies to?
cluster autoscaler
What type of PR is this?
/kind bug
What this PR does / why we need it:
The AWS cloud provider's cloud config has diverged from legacy-cloud-providers: the former has configuration options that the legacy cloud provider does not know about. Parsing the cloud config will fail if any of these additional fields are used.
Switching to the AWS provider from cloud-provider-aws solves this issue.
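For illustration, the failure mode looks roughly like the sketch below. It assumes a gcfg-style INI parser (which the legacy provider's cloud config parsing is based on); the struct and the NodeIPFamilies field are stand-ins, not the actual definitions:

```go
package main

import (
	"fmt"
	"strings"

	gcfg "gopkg.in/gcfg.v1"
)

// legacyCloudConfig is a trimmed stand-in for the legacy provider's config
// struct: it only knows about a subset of the fields that the maintained
// cloud-provider-aws accepts.
type legacyCloudConfig struct {
	Global struct {
		Zone string
	}
}

func main() {
	// A cloud.conf using a field that only the newer provider understands
	// (NodeIPFamilies is illustrative here).
	conf := `[Global]
Zone = us-east-1a
NodeIPFamilies = ipv4
`
	var cfg legacyCloudConfig
	// gcfg rejects variables that have no corresponding struct field, so the
	// whole config fails to parse even though Zone on its own would be fine.
	if err := gcfg.ReadInto(&cfg, strings.NewReader(conf)); err != nil {
		fmt.Println("parse failed:", err)
	}
}
```

With the maintained cloud-provider-aws config types the same file parses, since those fields exist there.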
Does this PR introduce a user-facing change?
NONE
+AWS owners @jaypipes @gjtempleton @drmorr0
Could I get some feedback on this one please?
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: jaypipes, olemarkus
Once this PR has been reviewed and has the lgtm label, please assign aleksandra-malinowska for approval by writing /assign @aleksandra-malinowska in a comment. For more information see: The Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Could we get this merged now?
Hey @olemarkus, sorry, previously missed this.
As we can see with this PR suddenly having conflicts due to the updated vendoring of the k/k 1.26-rc.1 dependencies (#5336) we have a strong preference for sticking to the upstream dependencies at the top level of the repo, and vendoring cloud provider specific dependencies/drifts from these dependencies under the relevant cloud-provider.
Barring a strong reason to deviate from that I'd suggest that's the pattern we should follow here.
Hey. I am a bit unsure what you mean here. legacy-cloud-providers is ... legacy. The specific CCMs should be considered the upstream. The OP already describes a bug that comes from using the unmaintained legacy CCM implementations vs the actually maintained ones.
Sorry, to add a bit more context: the CA includes the entire scheduler code to perform scheduling simulation inside the CA, which means we currently include its entire dependency set and update the repo's go.mod to match upstream using this script when updating the CA to support new versions of k8s.
This means pulling in new dependencies which aren't part of k/k's dependencies can result in significant version conflicts the next time we have to update the vendored deps. Therefore, our approach recently has been that any dependencies which aren't brought in with k/k should be vendored under the cloud provider making use of them. See the AWS providers' vendoring of the go-aws-sdk to allow use of a newer version than that used by k/k for an example.
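To make that pattern concrete, the split looks roughly like this inside the AWS cloudprovider package (a sketch of the layout; the import paths and the helper are illustrative, not taken from this PR):

```go
package aws

import (
	// Resolved against the repo's top-level go.mod, which tracks k/k's
	// dependency versions.
	apiv1 "k8s.io/api/core/v1"

	// Resolved against the copy kept under cloudprovider/aws, so the AWS SDK
	// version can move independently of the shared, k/k-aligned dependency set.
	"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/aws/aws-sdk-go/aws"
)

// nodeRegion is a toy helper that touches both dependency sources: it reads
// the region label from a Node and wraps it as an SDK-style *string.
func nodeRegion(node *apiv1.Node) *string {
	return aws.String(node.Labels["topology.kubernetes.io/region"])
}
```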
It does look like only the AWS and GCE providers make use of legacy-cloud-providers, so if we can move both to the new ones I'd think the dependencies we pull in at the top level might be able to shrink even further.
Ah. That makes a lot more sense. But I have concerns:
This can get interesting, with the CCMs having their own set of interesting dependencies: the CCM depends on specific AWS SDK versions and a host of k8s.io libraries of its own. So I am not sure this necessarily improves the situation. The CCM code itself would probably have to be modified to use the cloud provider's AWS SDK version too; it may or may not work with the top-level or cloud-provider-specific version.
I haven't looked too closely at what exactly the CA needs from the CCM, but it might be simpler to copy over the functionality instead of vendoring (i.e. copy functions/structs, not entire files).
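As a hypothetical sketch of that copy-over approach (names invented for illustration, not this PR's actual change):

```go
// Package aws keeps a small local copy of just the pieces the CA needs,
// instead of adding the whole cloud-provider-aws module to go.mod.
package aws

import "strings"

// azToRegion mirrors the kind of tiny helper worth copying rather than
// vendoring: derive the region from an availability zone name
// ("us-east-1a" -> "us-east-1"). The real provider's version differs.
func azToRegion(az string) string {
	return strings.TrimRight(az, "abcdefghijklmnopqrstuvwxyz")
}
```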
Aha, yeah, that does sound like a real pain to potentially untangle.
From a quick look through, AWS and GCE are the only two cloud providers importing legacy-cloud-providers, though I've not dug deeply into whether they're using the same areas of functionality yet, so agreed that just copying over the required functionality may be a simpler solution.
@MaciekPytel do you have any thoughts?
/assign @gjtempleton
(Since you're already reviewing this.)
FWIW, I think the right way to handle deps in the long term is https://github.com/kubernetes/autoscaler/issues/5394
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
@olemarkus do you think we can expedite making this PR mergeable? This is currently blocking the release of CA v1.27: https://github.com/kubernetes/autoscaler/issues/5625
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: olemarkus / name: Ole Markus With (9613f30b9382496593b42ba3f32ca7ac6640c52c)
/lgtm
Thanks. /lgtm /approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: gjtempleton, jaypipes, olemarkus
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [gjtempleton]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment