
Implement "Secondary rate limit" behavior to internally throttle querying

Open bitwiseman opened this issue 1 year ago • 5 comments

See #1805, #1842, and #2009.

This docs page describes secondary rate limit behavior: https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#about-secondary-rate-limits

As of this writing, it says:

You may encounter a secondary rate limit if you:

  • Make too many concurrent requests. No more than 100 concurrent requests are allowed. This limit is shared across the REST API and GraphQL API.
  • Make too many requests to a single endpoint per minute. No more than 900 points per minute are allowed for REST API endpoints, and no more than 2,000 points per minute are allowed for the GraphQL API endpoint. For more information about points, see "Calculating points for the secondary rate limit."
  • Make too many requests per minute. No more than 90 seconds of CPU time per 60 seconds of real time is allowed. No more than 60 seconds of this CPU time may be for the GraphQL API. You can roughly estimate the CPU time by measuring the total response time for your API requests.
  • Create too much content on GitHub in a short amount of time. In general, no more than 80 content-generating requests per minute and no more than 500 content-generating requests per hour are allowed. Some endpoints have lower content creation limits. Content creation limits include actions taken on the GitHub web interface as well as via the REST API and GraphQL API.

These secondary rate limits are subject to change without notice. You may also encounter a secondary rate limit for undisclosed reasons.

These are incredibly loosely defined guidelines, and you cannot query for them ahead of time. 👎 It looks like we need to take the path some users have suggested and make rate limiting much more resilient, potentially allowing users to write their own rate limit strategies for handling secondary rate limits.

The current internal GitHubRateLimitChecker would need to be replaced by a PrimaryGitHubRateLimiter which extends a new GitHubRateLimiter class/interface. Then each of the above bullet points would become a new rate limit tracking/enforcing class. All of them would need to be called before and after each query, and maintain their own configuration and calculated state. GitHubRateLimiter would provide the API and possibly helper functions to make that easier to do right.
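
To make that concrete, here is a rough sketch. `GitHubRateLimiter` is the proposed (not yet existing) type named above; the method names, `ConcurrentRequestLimiter`, and the numbers are purely illustrative guesses, not library API.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.Semaphore;

// Sketch only: the interface name comes from this proposal; the method
// names and the concrete limiter below are illustrative.
interface GitHubRateLimiter {
    // Called before each request. A non-empty Duration asks the client
    // to wait that long and then check again.
    Optional<Duration> beforeRequest();

    // Called after each response so the limiter can update its state.
    void afterResponse();
}

// One limiter per documented secondary limit, e.g. the
// "no more than 100 concurrent requests" rule:
class ConcurrentRequestLimiter implements GitHubRateLimiter {
    private final Semaphore inFlight = new Semaphore(100);

    @Override
    public Optional<Duration> beforeRequest() {
        // Empty means "proceed" (a permit was taken); the caller must
        // call afterResponse() only for requests that actually ran.
        return inFlight.tryAcquire()
                ? Optional.empty()
                : Optional.of(Duration.ofMillis(250));
    }

    @Override
    public void afterResponse() {
        inFlight.release();
    }
}
```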

I think the basic API would be a method called before each request is sent that returns an Optional<Duration>; if more than one limiter returns a Duration, the longest one is used. Or maybe it should return an optional record that includes a reason message and a duration, perhaps also a logLevel/severity, to make it easier to produce meaningful output.
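
A sketch of that richer return value; the record and combiner below are hypothetical names, not existing API:

```java
import java.time.Duration;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.logging.Level;

// Hypothetical: carry a reason and severity along with the wait so the
// client can log something meaningful before sleeping.
record ThrottleAdvice(Duration duration, String reason, Level severity) {}

class ThrottleAdviceCombiner {
    // If more than one limiter asks for a wait, honor the longest one.
    static Optional<ThrottleAdvice> combine(List<Optional<ThrottleAdvice>> perLimiter) {
        return perLimiter.stream()
                .flatMap(Optional::stream)
                .max(Comparator.comparing(ThrottleAdvice::duration));
    }
}
```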

bitwiseman avatar Oct 22 '24 07:10 bitwiseman

I'm getting this error from two different locations when I attempt to search at all. Something appears to have gone sideways with the implementation.

I understand the need for rate limiting, but this doesn't seem like it was tested thoroughly enough prior to release.

realskudd avatar Oct 22 '24 17:10 realskudd

@realskudd

I'm not sure what you're referring to. This is a general task. No changes were made in this library, so it couldn't have been "pushed too quickly".

If you want to be angry, take it up with GitHub. Or you could submit PRs to help this library handle the changes.

In the meantime, I suggest you file a separate issue describing the problem and including the code to reproduce it. Then we might be able to help you work around the problem.

bitwiseman avatar Oct 25 '24 16:10 bitwiseman

@bitwiseman I think it would actually make sense to add retries with backoff to calls to GitHub. Ideally, if the error GitHub returns indicates ephemeral rate limiting, we could retry.

On the developer's side, we could control the number of retries through some parameters.
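
A generic sketch of that suggestion (the helper below is hypothetical, not part of this library):

```java
import java.io.IOException;
import java.util.concurrent.Callable;

class RetryWithBackoff {
    // Hypothetical helper: retry a call that failed with what we guess
    // is an ephemeral rate-limit error, doubling the wait each attempt.
    static <T> T call(Callable<T> request, int maxRetries) throws Exception {
        long waitMillis = 1_000;
        for (int attempt = 0; ; attempt++) {
            try {
                return request.call();
            } catch (IOException e) { // e.g. a secondary-rate-limit response
                if (attempt >= maxRetries) {
                    throw e;
                }
                Thread.sleep(waitMillis);
                waitMillis *= 2; // exponential backoff
            }
        }
    }
}
```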

Hronom avatar Oct 26 '24 14:10 Hronom

@Hronom

GitHub expects clients to avoid exceeding the primary and secondary rate limits. Accordingly, they make the penalty for exceeding them very high.

We already accurately handle primary rate limits because GitHub provides actionable information in the response headers.

No information is provided for secondary rate limits. Reacting after we exceed the secondary rate limit is what we already do.

We already have retries and waits. The number and duration of the waits are configurable. We already attempt to determine whether errors are due to secondary rate limits, to the degree GitHub lets us.
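
For reference, this is roughly how those waits and retries are wired up when building the client. The handler names below (RateLimitHandler.WAIT, AbuseLimitHandler.WAIT) are the classic spellings; newer releases rename these, so check the version you use.

```java
import java.io.IOException;
import org.kohsuke.github.AbuseLimitHandler;
import org.kohsuke.github.GitHub;
import org.kohsuke.github.GitHubBuilder;
import org.kohsuke.github.RateLimitHandler;

public class ThrottledClient {
    public static GitHub connect() throws IOException {
        return new GitHubBuilder()
                .withOAuthToken(System.getenv("GITHUB_TOKEN"))
                // Primary rate limit: sleep until the reset time taken
                // from the response headers, then retry.
                .withRateLimitHandler(RateLimitHandler.WAIT)
                // Secondary ("abuse") rate limit: wait and retry when
                // GitHub reports the limit was exceeded.
                .withAbuseLimitHandler(AbuseLimitHandler.WAIT)
                .build();
    }
}
```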

bitwiseman avatar Oct 26 '24 18:10 bitwiseman

@bitwiseman I didn't know there was a possibility to retry failed calls. Are there some examples?

Hronom avatar Oct 27 '24 00:10 Hronom