github-api icon indicating copy to clipboard operation
github-api copied to clipboard

`GitHubAbuseLimitHandler#isError` fails to detect some secondary rate limit errors

Open yrodiere opened this issue 11 months ago • 9 comments

(Maintainer's note: see also #1975 ) Describe the bug

#1895 attempted to detect more errors related to secondary rate limits, but it seems there are more.

It appears GitHub APIs -- https://api.github.com/search/issues in particular, maybe more -- can hit a secondary limit and just return a 403 error, with no particular header indicating that a limit was reached. No Retry-After, no gh-limited-by, nothing. Only the body response explains "You have exceeded a secondary rate limit. Please wait a few minutes before you try again". (and in fact waiting 30s/1min is enough).

This, understandably, doesn't get caught by GitHubAbuseLimitHandler#isError, and results in the request simply failing.

To Reproduce

Send the following request a few dozen times (or just a dozen when unauthenticated):

https://api.github.com/search/issues?sort=updated&order=desc&q=repo%3Ayrodiere%2Fquarkus-github-playground+is%3Aissue+is%3Aopen+label%3Atriage%2Fneeds-feedback%2Ctriage%2Fneeds-reproducer+label%3Aarea%2Fhibernate-search+updated%3A%3C2025-01-15T12%3A03%3A47.243460581

Obviously my actual use case is not to send the same request over and over, but it's enough to reproduce the problem.

Interestingly, you can just click that URL in your browser, refresh quickly a couple times, and you'll be able to observe the exact error I'm running into: HTTP 403 with no particular retry header.

See bottom of this messsage for more logs.

Expected behavior

Ideally, I'd expect such requests to be detected as secondary rate limit errors, and the client to just wait and retry. For my use case it's fine to wait a minute.

Failing that, I'd settle for a way to override GitHubAbuseLimitHandler#isError and implement some dirty logic based on the response body in there. But this method is currently package-protected.

Desktop (please complete the following information):

  • OS: Fedora 41
  • Browser [e.g. chrome, safari]: no browser involved
  • Version [e.g. 22]: no browser involved

Additional context

Here is a failing request/response with all details/headers. I got it using -Djdk.httpclient.HttpClient.log=all.

2025-01-15 13:04:53,295 INFO  [jdk.htt.HttpClient] (executor-thread-1) REQUEST: https://api.github.com/search/issues?sort=updated&order=desc&q=repo%3Ayrodiere%2Fquarkus-github-playground+is%3Aissue+is%3Aopen+label%3Atriage%2Fneeds-feedback%2Ctriage%2Fneeds-reproducer+label%3Aarea%2Fhibernate-search+updated%3A%3C2025-01-15T12%3A03%3A47.243460581 GET
2025-01-15 13:04:53,295 INFO  [jdk.htt.HttpClient] (executor-thread-1) HEADERS: REQUEST HEADERS:
GET /search/issues?sort=updated&order=desc&q=repo%3Ayrodiere%2Fquarkus-github-playground+is%3Aissue+is%3Aopen+label%3Atriage%2Fneeds-feedback%2Ctriage%2Fneeds-reproducer+label%3Aarea%2Fhibernate-search+updated%3A%3C2025-01-15T12%3A03%3A47.243460581 HTTP/1.1
Content-Length: 0
Host: api.github.com
User-Agent: Java-http-client/17.0.13
Accept: application/vnd.github+json
Accept-Encoding: gzip
Authorization: token <NOPE>
X-GitHub-Api-Version: 2022-11-28


2025-01-15 13:04:53,422 INFO  [jdk.htt.HttpClient] (HttpClient-3-Worker-0) HEADERS: RESPONSE HEADERS:
    access-control-allow-origin: *
    access-control-expose-headers: ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset
    content-encoding: gzip
    content-security-policy: default-src 'none'
    content-type: application/json; charset=utf-8
    date: Wed, 15 Jan 2025 12:04:53 GMT
    referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin
    server: github.com
    strict-transport-security: max-age=31536000; includeSubdomains; preload
    transfer-encoding: chunked
    vary: Accept-Encoding, Accept, X-Requested-With
    x-content-type-options: nosniff
    x-frame-options: deny
    x-github-api-version-selected: 2022-11-28
    x-github-media-type: github.v3; format=json
    x-github-request-id: <NOPE>
    x-ratelimit-limit: 30
    x-ratelimit-remaining: 21
    x-ratelimit-reset: 1736942747
    x-ratelimit-resource: search
    x-ratelimit-used: 9
    x-xss-protection: 0


2025-01-15 13:04:53,423 INFO  [jdk.htt.HttpClient] (HttpClient-3-Worker-0) RESPONSE: (GET https://api.github.com/search/issues?sort=updated&order=desc&q=repo%3Ayrodiere%2Fquarkus-github-playground+is%3Aissue+is%3Aopen+label%3Atriage%2Fneeds-feedback%2Ctriage%2Fneeds-reproducer+label%3Aarea%2Fhibernate-search+updated%3A%3C2025-01-15T12%3A03%3A47.243460581) 403 HTTP_1_1 Local port:  57158

I end up with this error, proving the secondary limit error wasn't detected:

org.kohsuke.github.GHException: Failed to retrieve https://api.github.com/search/issues?sort=updated&order=desc&q=repo%3Ayrodiere%2Fquarkus-github-playground+is%3Aissue+is%3Aopen+label%3Atriage%2Fneeds-feedback%2Ctriage%2Fneeds-reproducer+label%3Aarea%2Fhibernate-search+updated%3A%3C2025-01-15T12%3A03%3A47.243460581
        at org.kohsuke.github.GitHubPageIterator.fetch(GitHubPageIterator.java:157)
        at org.kohsuke.github.GitHubPageIterator.hasNext(GitHubPageIterator.java:93)
        at org.kohsuke.github.PagedSearchIterable$1.hasNext(PagedSearchIterable.java:111)
        at org.kohsuke.github.PagedIterator.fetch(PagedIterator.java:116)
        at org.kohsuke.github.PagedIterator.hasNext(PagedIterator.java:84)
        at java.base/java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1855)
        at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:292)
        at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
        at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
        at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:298)
        at java.base/java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
        at io.quarkus.github.lottery.draw.Lottery$Draw.runSingleRound(Lottery.java:286)
        at io.quarkus.github.lottery.draw.Lottery.draw(Lottery.java:80)
        at io.quarkus.github.lottery.LotteryService.doDrawForRepository(LotteryService.java:109)
        at io.quarkus.github.lottery.LotteryService.drawForRepository(LotteryService.java:82)
        at io.quarkus.github.lottery.LotteryService.draw(LotteryService.java:66)
        at io.quarkus.github.lottery.LotteryService_ClientProxy.draw(Unknown Source)
        at io.quarkus.github.lottery.LotteryCli$DrawCommand.run(LotteryCli.java:27)
        at io.quarkus.github.lottery.LotteryCliCommandDispatcherImpl.dispatch(Unknown Source)
        at io.quarkus.github.lottery.LotteryCliCommandDispatcherImpl_Multiplexer.dispatch_9c21b751bd1c3c76c9e9f8d4cbcacbc05e9e7ed7(Unknown Source)
        at io.quarkus.github.lottery.LotteryCliCommandDispatcherImpl_Multiplexer_Observer_dispatch_9c21b751bd1c3c76c9e9f8d4cbcacbc05e9e7ed7_usg3oRfZFhFtsdK_zfgxrYi5m6Q.notify(Unknown Source)
        at io.quarkus.arc.impl.EventImpl$Notifier.notifyObservers(EventImpl.java:351)
        at io.quarkus.arc.impl.EventImpl$Notifier.notify(EventImpl.java:333)
        at io.quarkus.arc.impl.EventImpl$1.get(EventImpl.java:110)
        at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
        at io.quarkus.vertx.core.runtime.VertxCoreRecorder$15.runWith(VertxCoreRecorder.java:637)
        at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2675)
        at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2654)
        at org.jboss.threads.EnhancedQueueExecutor.runThreadBody(EnhancedQueueExecutor.java:1627)
        at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1594)
        at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
        at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: org.kohsuke.github.HttpException: {
  "documentation_url": "https://docs.github.com/free-pro-team@latest/rest/overview/rate-limits-for-the-rest-api#about-secondary-rate-limits",
  "message": "You have exceeded a secondary rate limit. Please wait a few minutes before you try again. If you reach out to GitHub Support for help, please include the request ID 6F00:24CFC5:<NOPE>."
}
        at org.kohsuke.github.GitHubConnectorResponseErrorHandler$1.onError(GitHubConnectorResponseErrorHandler.java:83)
        at org.kohsuke.github.GitHubClient.detectKnownErrors(GitHubClient.java:504)
        at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:464)
        at org.kohsuke.github.GitHubPageIterator.fetch(GitHubPageIterator.java:146)
        ... 33 more

yrodiere avatar Jan 15 '25 12:01 yrodiere

FWIW I submitted a ticket to GitHub support to see if they would consider it a bug that there's no useful header in the response. We'll see.

yrodiere avatar Jan 15 '25 13:01 yrodiere

FWIW I submitted a ticket to GitHub support to see if they would consider it a bug that there's no useful header in the response. We'll see.

I got answers, and it's basically working as intended.

  1. Not adding Retry-After is essentially obfuscation designed to protect the service against abuse.
  2. The gh-limited-by header is never returned by github.com (only some GitHub Enterprise installs), and there are no plans to add it to this particular endpoint and/or to github.com.

I was told my feedback would be forwarded to the relevant team, but the message is clear: at the moment, there are no plans to make this situation easier to detect than through parsing the response body.

yrodiere avatar Jan 20 '25 16:01 yrodiere

And neither using a status code?

On Mon, Jan 20, 2025 at 5:03 PM Yoann Rodière @.***> wrote:

FWIW I submitted a ticket to GitHub support to see if they would consider it a bug that there's no useful header in the response. We'll see.

I got answers, and it's basically working as intended.

  1. Not adding Retry-After is essentially obfuscation designed to protect the service against abuse.
  2. The gh-limited-by header is never returned by github.com (only some GitHub Enterprise installs), and there are no plans to add it.

I was told my feedback would be forwarded to the relevant team, but the message is clear: at the moment, there are no plans to make this situation easier to detect than through parsing the response body.

— Reply to this email directly, view it on GitHub https://github.com/hub4j/github-api/issues/2009#issuecomment-2602787414, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJYOBJRWIKDKAB67LHDFD32LUM3JAVCNFSM6AAAAABVHDXZUCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMMBSG44DONBRGQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

gsmet avatar Jan 20 '25 17:01 gsmet

And neither using a status code?

Apparently not. Probably part of the obfuscation strategy. But then it's hard to ask for details about an obfuscation strategy :)

I tried to push a bit but the position is clear: "not a bug, though we understand your position and will forward your concerns to relevant team." (I'm paraphrasing)

I'd say we can hope for improvements, but not anytime soon, and we'd need contingency plans.

FWIW something like this works around the issue: https://github.com/quarkusio/quarkus-github-lottery/pull/208/files#diff-0cdbca0be53fabaaac67caa063012bbbece2560b1a716c6d4b5c6f1447d8bc77 It could possibly be implemented directly in GitHubAbuseLimitHandler.

yrodiere avatar Jan 21 '25 09:01 yrodiere

@bitwiseman hey! Would you be willing to accept an ad hoc patch similar to the checks @yrodiere is doing ^ to work around the issue?

Would be hackyish until GitHub decides to do something about it (might be forever) but probably better than not handling this case properly?

Thanks!

gsmet avatar Jan 21 '25 09:01 gsmet

Hello all! We've had this issue internally also, with the abuse limit handler not properly detecting when we hit a secondary / abuse limit. Having talked with Github support, and some other internal testing with this library and the Github API within my team etc. here is what I can glean:

  1. As mentioned by @yrodiere, gh-limited-by is never actually used as part of abuse limiting. Additionally, the Retry-After header can be obfuscated.

  2. Not only is the header obfuscated, the exact limits are obfuscated due to the needs of the Github platform. When the header is missing, an exponential dropoff with some randomness to avoid the "thundering herd" problem is recommended by the support agent.

  3. Headers could potentially come back as lower case. This might mean that Retry-After would not be detected, as it would come back as retry-after, and the current header retrieval behaviour checks the headers in a case sensitive way. This is more towards usage of HTTP 1.1 vs 2.0, but currently the code in this library does not seem consistent in which it wants to support - e.g. Retry-After vs gh-limited-by.

  4. The message can contain one of two things: "You have exceeded a secondary rate limit", or "You have triggered an abuse detection mechanism". 5. isError is not overridable like onError is, so this is not fixable locally.

I think the following might be suitable for changes:

  • To resolve the detection issues, the default isError is expanded to check for the retry-after header in a case sensitive way, with a response message check as a last case resort to detect these when the header(s) are not available

  • There is a default waiting of 60 seconds, which I think is sufficient. However, maybe an exponential backoff could be implemented as an option for clients who would want to wait less?

  • To resolve the potential for retry-after not being missed, the GithubConnectorResponse#header(String) method could be extended to be case insensitive, to be compatable with both HTTP 1.1 and 2.0.

  • In the case that some of these are not desired in the library, isError should be overridable so a client can check headers and messages in their own way.

I plan to open some PRs for these (particularly for updating isError behaviour changes, making isError overridable, and potentially making headers case-insensitive), but any feedback or additional thoughts on these would be appreciated! 🙏

rozza-sb avatar Aug 20 '25 15:08 rozza-sb

@rozza-sb

It looks like this comment go lost in the pile of notifications, or maybe I just hadn't checked back recently.

  1. Okay, if it isn't used, it should be removed. I'm not sure what you mean by obfuscated.
  2. Sure, a centrally implemented exponential backoff sounds like a good feature. Well tested PR welcome.
  3. From what I can see this is not true, the header check should be case-insensitive. If it isn't, that is separate bug that should be fixed. https://github.com/hub4j/github-api/blob/40f3b4dc74f1fc3021f1bc07878c82e5250ad76c/src/main/java/org/kohsuke/github/connector/GitHubConnectorResponse.java#L96-L99
  4. This is great information and universally useful. It should be submitted as a PR to this project, not fixed locally by individual projects.

Thanks for submitting #2127, but of the proposed options you mentioned above that is only one I'd push back on at all. We can discuss further there.

bitwiseman avatar Sep 04 '25 15:09 bitwiseman

@bitwiseman thanks for getting back to me!

  1. Apologies for the confusion. gh-limited-by is the header that isn't used, as confirmed by my conversation with Github support. Separately from that but also confirmed by my support interaction, Retry-After can be obfuscated, as in not appear in the response. Therefore, the current check of Retry-After and gh-limited-by can fail due to the former header possibly not being present, and then the latter header never being present.
  2. Not certain if I specifically have the time for this, but I agree that it would be a good addition 👍
  3. I seem to have missed this instantiation of the case insensitive map - having only checked the #header(String name) method, it was assumed that headers was a standard case sensitive map. Thanks for the catch! This would also mean that the potential lack of Retry-After is the only thing that would cause a false negative, rather than a case mismatch.
  4. +1 on this being the desired fix. Happy to discuss about the fate of #2127 there, as I assume that conflicts with this point 😄

rozza-sb avatar Sep 04 '25 15:09 rozza-sb

@bitwiseman following on from your suggestion in #2127 to abstract the response body checking logic, are we needing / wanting to support multiple checking strategies?

You mentioned both checking N lines, and 2kb of content - it's possible we can support both cases as separate checker types. (with the latter checking the first N-kb for flexibility). Other types might possibly spawn from this also - one initial thought would be for checking specific keys in a response that we know to be JSON formatted.

Alternatively, we can have a single checker that does a single strategy that we use everywhere, which I see as just checking up to a certain number of bytes from the start of the response body.

rozza-sb avatar Sep 05 '25 19:09 rozza-sb