
Generic API-based cluster autoscaler

Open Bessonov opened this issue 4 years ago • 24 comments

There is high demand for supporting custom cloud autoscaler providers:

  • https://github.com/kubernetes/autoscaler/issues/191
  • https://github.com/kubernetes/autoscaler/issues/984
  • https://github.com/kubernetes/autoscaler/issues/1050
  • https://github.com/kubernetes/autoscaler/issues/2053

Well, I'm not able to reopen any of them. This request goes beyond hard-coding every possible provider into the autoscaler code base. I'm not sure why every provider must be integrated into the source tree, follow the same unpredictable release schedule and review process, and use the same license (although Apache is fine). It puts a limit on scaling development.

Furthermore, the integrated providers are limited to fairly "standard" operations. Several use cases are not covered:

  • For example, I need an autoscaler integration for Hetzner. The fresh PR https://github.com/kubernetes/autoscaler/pull/3640 wouldn't fit, because I must (automatically) run some operations in rescue mode before handing the node to the pool. This process would probably never be integrated into the autoscaler source code.
  • The API could be as simple as "send an email to the administrator to provision new nodes". In fact, that's how many companies work. This could also serve as an interface for dedicated servers: netcup, for example, sends an email with credentials once a server is ready for provisioning.
  • Other side effects like custom billing, Slack messages, inventory, etc.
  • Other programming languages instead of Go.
  • Interaction with non-kubeadm clusters. Currently I'm experimenting with microk8s.

A more generic solution could send the desired actions to a (single) configurable REST endpoint, possibly a service inside the cluster. That would allow a decentralized and powerful way to build custom autoscaler providers.

I'm aware of Cluster API and the Cluster API Provider, but I'm not sure how they would address the use cases above.

Maybe I'm just not aware of an existing solution. Is there any workaround for the use cases above? Any pointers are appreciated.

Bessonov avatar Oct 26 '20 15:10 Bessonov

Found two more use cases:

  • https://github.com/kubernetes/autoscaler/issues/3614
  • https://github.com/kubernetes/autoscaler/issues/3204

And a relevant quote from @MaciekPytel about scaling development:

We're happy to accept more provider integrations, but we (core developers) are unable to even support existing ones anymore. Instead each provider has its own owners who maintain it. So a prerequisite to accepting a new provider would be to have someone willing to take responsibility for it.

I think this shows how a generic interface could help.

Bessonov avatar Oct 26 '20 16:10 Bessonov

Found a proposal for a generic (gRPC) API: https://github.com/kubernetes/autoscaler/pull/3140.

Bessonov avatar Oct 26 '20 16:10 Bessonov

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot avatar Jan 24 '21 17:01 fejta-bot

/remove-lifecycle stale

Bessonov avatar Jan 24 '21 17:01 Bessonov

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

fejta-bot avatar Apr 24 '21 18:04 fejta-bot

/remove-lifecycle stale

unixfox avatar Apr 24 '21 18:04 unixfox

What's the status of plugable-provider-grpc.md? It seems to be the most promising.

@hectorj2f

technicianted avatar Jun 15 '21 23:06 technicianted

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 13 '21 23:09 k8s-triage-robot

/remove-lifecycle stale /remove-lifecycle rotten

kfox1111 avatar Sep 13 '21 23:09 kfox1111

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Oct 15 '21 13:10 k8s-triage-robot

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Oct 15 '21 13:10 k8s-ci-robot

/reopen

Bessonov avatar Oct 15 '21 16:10 Bessonov

@Bessonov: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Oct 15 '21 16:10 k8s-ci-robot

/remove-lifecycle rotten

Bessonov avatar Oct 15 '21 16:10 Bessonov

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 13 '22 17:01 k8s-triage-robot

/remove-lifecycle stale

Bessonov avatar Jan 13 '22 17:01 Bessonov

You might want to review/test this: https://github.com/kubernetes/autoscaler/pull/4654

dbonfigli avatar Jan 30 '22 17:01 dbonfigli

Oh, wow, thank you very much for your work and for pointing to the implementation! I think this issue can be closed now.

Bessonov avatar Jan 30 '22 18:01 Bessonov

Why close it? The PR hasn't been merged.

unixfox avatar Jan 30 '22 18:01 unixfox

As that PR isn't merged yet, would it be best to leave this issue open?

AverageMarcus avatar Jan 30 '22 18:01 AverageMarcus

Hey guys, I think this issue should have been closed already after #3140 was merged. But I've no stake in it :)

Bessonov avatar Jan 30 '22 18:01 Bessonov

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 30 '22 18:04 k8s-triage-robot

/remove-lifecycle stale

AverageMarcus avatar Apr 30 '22 18:04 AverageMarcus

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jul 29 '22 19:07 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 28 '22 20:08 k8s-triage-robot

https://github.com/kubernetes/autoscaler/pull/4654 has been merged, and the cluster autoscaler now has a gRPC-based plugin system; this issue can probably be closed.

dbonfigli avatar Aug 31 '22 18:08 dbonfigli