cluster-api
Explore alternatives to rate limited GitHub API for clusterctl
User Story
As a user, I would like to use clusterctl without the risk of hitting GitHub API rate limits.
Detailed Description
Clusterctl fetches artifacts from the provider's repository, and if the repository is hosted on GitHub this operation is subject to GitHub API rate limits.
https://github.com/kubernetes-sigs/cluster-api/issues/2450 already raised this problem, and https://github.com/kubernetes-sigs/cluster-api/pull/2848 implemented caching so now clusterctl init executes 8 API calls (+3 for each additional provider), which is ~1/6 of the total number of allowed calls in 1hr (without a GitHub token).
While this is generally OK for the majority of users, there is still a chance that users (especially developers or CI tools) hit this limit, so we should explore alternatives for overcoming it, e.g.:
- hosting artifacts somewhere else
- using alternative options for fetching artifacts (see https://github.com/kubernetes-sigs/cluster-api/issues/2450#issuecomment-607395977)
- ???
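To make the call budget quoted above concrete, here is a small sketch of the arithmetic. It assumes the figures from this issue (8 calls per init, plus 3 per additional provider) and GitHub's unauthenticated REST API limit of 60 requests per hour; the function names are hypothetical and are not part of clusterctl.

```go
package main

import "fmt"

// unauthenticatedLimit is GitHub's hourly REST API quota for requests
// made without a token.
const unauthenticatedLimit = 60

// initCalls estimates the API calls for one `clusterctl init`, using the
// figures quoted in this issue: 8 calls, plus 3 per additional provider.
// (Hypothetical helper for illustration only.)
func initCalls(extraProviders int) int {
	return 8 + 3*extraProviders
}

// initsPerHour returns how many complete init runs fit into the hourly quota.
func initsPerHour(extraProviders int) int {
	return unauthenticatedLimit / initCalls(extraProviders)
}

func main() {
	for extra := 0; extra <= 3; extra++ {
		fmt.Printf("extra providers: %d, calls per init: %d, inits per hour: %d\n",
			extra, initCalls(extra), initsPerHour(extra))
	}
}
```

Under these assumptions, a CI job that runs init more than a handful of times per hour from the same IP would exhaust the quota, which matches the concern raised here.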
/kind feature
/area clusterctl
/milestone v0.4.0
/help
@fabriziopandini: This request has been marked as needing help from a contributor.
Please ensure the request meets the requirements listed here.
If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.
I'd make use of Go module proxy + raw.github (wherever possible)
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle frozen
Can we prioritize this issue a bit more? GitHub rate limits are pretty fast to hit, and that might impact the user experience. Could we move to using Go modules like we do in https://github.com/kubernetes-sigs/cluster-api/blob/d2a110e53f48614654b7ee99afc8e16bdc9ad973/hack/tools/mdbook/releaselink/releaselink.go#L61-L84?
/milestone v1.0
/priority important-soon
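As a rough illustration of the Go modules idea above, the sketch below lists versions through the Go module proxy instead of the GitHub API. It relies on the proxy protocol's `@v/list` endpoint, which returns one version per line. The helper names are hypothetical, the version comparison is deliberately simplified (stable `vMAJOR.MINOR.PATCH` tags only), and this is not the code from releaselink.go or clusterctl.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"sort"
	"strconv"
	"strings"
)

// listURL builds the module proxy endpoint that lists known versions.
// Assumption: the module path has no upper-case letters, which would
// otherwise need the proxy's escaping rules.
func listURL(proxy, module string) string {
	return strings.TrimSuffix(proxy, "/") + "/" + module + "/@v/list"
}

// parseRelease parses "vMAJOR.MINOR.PATCH". Pre-releases, build metadata,
// and malformed entries are rejected (ok == false).
func parseRelease(v string) ([3]int, bool) {
	var out [3]int
	if !strings.HasPrefix(v, "v") || strings.ContainsAny(v, "-+") {
		return out, false
	}
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, false
		}
		out[i] = n
	}
	return out, true
}

// latestRelease returns the highest stable version found in the
// newline-separated body of an @v/list response.
func latestRelease(body string) (string, bool) {
	type entry struct {
		raw string
		num [3]int
	}
	var entries []entry
	for _, line := range strings.Fields(body) {
		if num, ok := parseRelease(line); ok {
			entries = append(entries, entry{raw: line, num: num})
		}
	}
	if len(entries) == 0 {
		return "", false
	}
	sort.Slice(entries, func(i, j int) bool {
		a, b := entries[i].num, entries[j].num
		for k := 0; k < 3; k++ {
			if a[k] != b[k] {
				return a[k] < b[k]
			}
		}
		return false
	})
	return entries[len(entries)-1].raw, true
}

func main() {
	resp, err := http.Get(listURL("https://proxy.golang.org", "sigs.k8s.io/cluster-api"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		return
	}
	if v, ok := latestRelease(string(body)); ok {
		fmt.Println("latest stable version:", v)
	}
}
```

Requests to the module proxy do not count against the GitHub API quota, which is the point of the suggestion; a real implementation would likely use a proper semver library rather than the simplified comparison here.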
Users (especially developers) are regularly hitting this issue.
/triage accepted
/help-wanted
Still encountering this issue (even with a personal token setup):
Fetching providers
Error: failed to get provider components for the "cluster-api" provider: failed to get repository client for the CoreProvider with name cluster-api: error creating the GitHub repository client: failed to get GitHub latest version: failed to get repository versions: failed to get repository versions: rate limit for github api has been reached. Please wait one hour or get a personal API token and assign it to the GITHUB_TOKEN environment variable
Version info:
clusterctl version
clusterctl version: &version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.1", GitCommit:"8b4214d72762394144b83dd6d14986ff7e274095", GitTreeState:"clean", BuildDate:"2022-08-12T16:26:37Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Hm I think that should not happen. Never had the problem on my machine. Maybe there is a way to confirm that the token / your account is really over the rate limit?
Sorry super stupid question but I assume the GITHUB_TOKEN env var is exported?
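One way to confirm whether a token is being picked up and how much quota remains is GitHub's `GET /rate_limit` endpoint, which does not itself consume quota. The sketch below is an assumption-laden illustration, not clusterctl code: the struct and function names are hypothetical, and only the `resources.core` fields of the response are decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// rateLimit mirrors the subset of the /rate_limit response used here.
type rateLimit struct {
	Resources struct {
		Core struct {
			Limit     int   `json:"limit"`
			Remaining int   `json:"remaining"`
			Reset     int64 `json:"reset"` // Unix seconds
		} `json:"core"`
	} `json:"resources"`
}

// parseRateLimit decodes a /rate_limit response body.
func parseRateLimit(data []byte) (rateLimit, error) {
	var rl rateLimit
	err := json.Unmarshal(data, &rl)
	return rl, err
}

func main() {
	req, err := http.NewRequest(http.MethodGet, "https://api.github.com/rate_limit", nil)
	if err != nil {
		fmt.Fprintln(os.Stderr, "building request failed:", err)
		return
	}
	// Send the token only if it is exported, so both cases can be compared.
	if token := os.Getenv("GITHUB_TOKEN"); token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		return
	}
	rl, err := parseRateLimit(body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "decode failed:", err)
		return
	}
	core := rl.Resources.Core
	fmt.Printf("limit=%d remaining=%d resets at %s\n",
		core.Limit, core.Remaining, time.Unix(core.Reset, 0).Format(time.RFC3339))
}
```

If the reported limit is 60, the request was treated as unauthenticated, which would suggest the token is missing, not exported, or invalid.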
I'll take a look and try to at least implement the improvement mentioned by @vincepri here: https://github.com/kubernetes-sigs/cluster-api/issues/3982#issuecomment-946793975 😃
I created a fresh PAT and it seems to work (I may have tested with an old PAT). I will keep an eye and report if it happens again.
The PAT requirement should be documented though.
+1. I will open an issue for this.
After reading the list of releases from go modules, the next big step is to find a way to download release assets without using GitHub API
Might be a stupid question but as we only have the sources in the go modules (https://pkg.go.dev/sigs.k8s.io/cluster-api#section-directories), from where else can we get the built/rendered release assets?
I have not researched the topic extensively, but if we can build a URL like https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.2.2/bootstrap-components.yaml with the info we have, I assume we can download from there without going through the GitHub API, similar to clicking the links in the assets section of the https://github.com/kubernetes-sigs/cluster-api/releases/tag/v1.2.2 page.
In other words, we will continue to get the assets from GitHub, but using plain HTTP calls instead of the GitHub API.
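A minimal sketch of the URL construction described above, assuming the `releases/download/<tag>/<asset>` layout seen in the example link. The function name is hypothetical and this is not how clusterctl is confirmed to implement it.

```go
package main

import (
	"fmt"
	"net/url"
)

// releaseAssetURL builds a direct download link for a release asset,
// following the pattern of the example URL in this thread.
// (Hypothetical helper; each segment is path-escaped defensively.)
func releaseAssetURL(owner, repo, tag, asset string) string {
	return "https://github.com/" +
		url.PathEscape(owner) + "/" +
		url.PathEscape(repo) + "/releases/download/" +
		url.PathEscape(tag) + "/" +
		url.PathEscape(asset)
}

func main() {
	fmt.Println(releaseAssetURL("kubernetes-sigs", "cluster-api", "v1.2.2", "bootstrap-components.yaml"))
}
```

The resulting link can be fetched with an ordinary HTTP GET, so the only inputs needed are the repository, the release tag (for example from the module proxy), and the asset file name.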
This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged.
Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.
You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Deprioritize it with /priority important-longterm or /priority backlog
- Close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
/triage accepted
Any updates on this?
@mjnovice - nobody has followed up on this as a general case as far as I know. Have you run into this issue?
@killianmuldoon we have a pipeline which creates and destroys Cluster API on a kind cluster, and we constantly get throttled by this.
Is there any way we can circumvent this?
The workaround is to use a GitHub token, which should get rid of most of the rate limiting, or else use a local repository containing the YAMLs that are normally fetched from GitHub. You should also pin versions to prevent clusterctl from looking them up.
+1 to what @killianmuldoon said. The bulk of the calls that clusterctl makes are to look up versions and download files from the GitHub releases. Some improvements were made to reduce the GitHub API calls by using goproxy to look up versions, but the best ways to avoid rate limiting are as Killian mentioned:
- Use a github token
- Use a local repository
- Pin versions (this only helps with version look ups but will still need to access github to download files)
The remaining items in this issue have been addressed by https://github.com/kubernetes-sigs/cluster-api/pull/9237 and https://github.com/kubernetes-sigs/cluster-api/pull/9236
/close
@fabriziopandini: Closing this issue.