autoscaler icon indicating copy to clipboard operation
autoscaler copied to clipboard

Use cloud-provider-aws instead of legacy-cloud-providers/aws

Open olemarkus opened this issue 3 years ago • 3 comments

Which component this PR applies to?

cluster autoscaler

What type of PR is this?

/kind bug

What this PR does / why we need it:

The AWS cloud provider's cloud config has diverged from legacy-cloud-provider. The former has more configuration options that legacy cloud provider does not know about. Parsing the cloud config will fail of any of these additional fields are used.

Switching to using the AWS provider from cloud-provider-aws solves this issue.

Does this PR introduce a user-facing change?

NONE

olemarkus avatar Aug 28 '22 08:08 olemarkus

+AWS owners @jaypipes @gjtempleton @drmorr0

feiskyer avatar Aug 29 '22 00:08 feiskyer

Could I get some feedback on this one please?

olemarkus avatar Sep 21 '22 05:09 olemarkus

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jaypipes, olemarkus Once this PR has been reviewed and has the lgtm label, please assign aleksandra-malinowska for approval by writing /assign @aleksandra-malinowska in a comment. For more information see:The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Sep 23 '22 17:09 k8s-ci-robot

Could we get this merged now?

olemarkus avatar Oct 21 '22 11:10 olemarkus

Hey @olemarkus, sorry, previously missed this.

As we can see with this PR suddenly having conflicts due to the updated vendoring of the k/k 1.26-rc.1 dependencies (#5336) we have a strong preference for sticking to the upstream dependencies at the top level of the repo, and vendoring cloud provider specific dependencies/drifts from these dependencies under the relevant cloud-provider.

Barring a strong reason to deviate from that I'd suggest that's the pattern we should follow here.

gjtempleton avatar Dec 03 '22 15:12 gjtempleton

Hey. I am a bit unsure what you mean here. cloud-provider-legacy is ... legacy. And the specific CCMs should be considered the upstream. The OP already describes a bug that comes from using the unmaintained legacy CCM implementations vs the actually maintained ones.

olemarkus avatar Dec 03 '22 19:12 olemarkus

Sorry, to add a bit more context, the CA includes the entire scheduler code to perform scheduling simulation inside the CA, this means we currently include its entire dependencies, and update the repo's go.mod to the upstream using this script when updating the CA to support new versions of k8s.

This means pulling in new dependencies which aren't part of k/k's dependencies can result in significant version conflicts the next time we have to update the vendored deps. Therefore, our approach recently has been that any dependencies which aren't brought in with k/k should be vendored under the cloud provider making use of them. See the AWS providers' vendoring of the go-aws-sdk to allow use of a newer version than that used by k/k for an example.

It does look like only the AWS and GCE providers make use of the legacy-code-providers, so if we can move both to the new ones I'd think the dependencies we pull in at the top level might be able to shrink even further.

gjtempleton avatar Dec 05 '22 12:12 gjtempleton

Ah. That makes a lot more sense. But I have concerns:

This can get interesting, with the CCMs having its own set of interesting dependencies: CCM depends on specific AWS SDK versions and a host of k8s.io libs on its own. So I am not sure this necessarily improves the situation. The CCM code itself probably has to be modified to use the cloud provider's AWS SDK version too. It may or may not work with the top-level/cloud provider-specific version.

I haven't looked too closely at what exactly CA need from CCM, but it might be that it is simpler to copy over the functionality instead of vendoring (i.e copy functions/structs, not entire files).

olemarkus avatar Dec 05 '22 12:12 olemarkus

Aha, yeah, that does sound like a real pain to potentially untangle.

From a quick look through, AWS and GCE are the only two cloudproviders importing the legacy-cloud-providers, though I've not dug deeply into whether they're using the same areas of functionality yet, so agreed that just copying over the required functionality may be a simpler solution.

@MaciekPytel do you have any thoughts?

gjtempleton avatar Dec 11 '22 17:12 gjtempleton

/assign @gjtempleton

(Since you're already reviewing this.)

FWIW, I think the right way to handle deps in the long term is https://github.com/kubernetes/autoscaler/issues/5394

x13n avatar Jan 09 '23 11:01 x13n

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 09 '23 12:04 k8s-triage-robot

/remove-lifecycle stale

Shubham82 avatar Apr 12 '23 11:04 Shubham82

@olemarkus do you think we can expedite making this PR mergeable. This is currently blocking release of CA v1.27 https://github.com/kubernetes/autoscaler/issues/5625

jayantjain93 avatar Apr 12 '23 11:04 jayantjain93

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: olemarkus / name: Ole Markus With (9613f30b9382496593b42ba3f32ca7ac6640c52c)

/lgtm

BigDarkClown avatar Apr 17 '23 11:04 BigDarkClown

Thanks. /lgtm /approve

gjtempleton avatar Apr 18 '23 14:04 gjtempleton

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gjtempleton, jaypipes, olemarkus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Apr 18 '23 14:04 k8s-ci-robot