
GitHub rate limiting despite default `GITHUB_TOKEN`

Open jan-ferdinand opened this issue 7 months ago • 24 comments

Hi there, thank you for your work on lychee as well as this GitHub Action. ❤

Recently, I started having CI failures due to GitHub's rate limiting, which hadn't happened before with this action.[^fail] As far as I understand, the lychee binary picks up on and uses the GITHUB_TOKEN environment variable by default, which should prevent rate limiting issues. It's not obvious how any of the recent commits to either this action or to lychee itself would have changed this behavior. Unfortunately, I'm at a bit of a loss: why is my action being rate limited?

[^fail]: This is the most recent failure at the time of writing.

jan-ferdinand avatar Apr 30 '25 07:04 jan-ferdinand

I see the same issue in my project: https://github.com/plabayo/rama/blob/f96ee2867e6211b7e50a28ef5f6da1d89c775188/.github/workflows/links.yml#L19

this always has worked before, do we need a different API key perhaps that has more bandwidth?

GlenDC avatar May 06 '25 18:05 GlenDC

Oh, I'm not aware of any breaking changes. We haven't changed the checking algorithm.

Looking at your summary, I noticed that you have 741 links.

# Summary
| Status        | Count |
|---------------|-------|
| 🔍 Total      | 741   |
| ✅ Successful | 288   |
| ⏳ Timeouts   | 0     |
| 🔀 Redirected | 0     |
| 👻 Excluded   | 1     |
| ❓ Unknown    | 0     |
| 🚫 Errors     | 452   |

I checked the pipeline and noticed that you run the link check on every push to `master` and every PR targeting `master`. So you could be hitting GitHub API rate limits if you have more than two such events per hour. Not all of these links are necessarily GitHub links, but they add up quickly!

Another issue appears in your workflow logs:

[WARN] Cache is too old (age: 39d 20h 34m 25s, max age: 1d 0h 0m 0s). Discarding and recreating.
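
(As an aside: the one-day cutoff in that warning is lychee's default. If it turns out to be too aggressive, it can be raised; a sketch, assuming the `--max-cache-age` flag is available in the action's bundled lychee version:)

```yaml
# Sketch (flag availability assumed): reuse a cache for up to three days
# instead of lychee's default of one day before discarding it.
args: --cache --max-cache-age 3d --verbose --no-progress '.'
```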

So, in summary, I guess you're facing two problems:

  1. GitHub API rate limiting (1,000 requests per hour per repository)
  2. Cache not updating when the link check fails

Here's what you could try:

on:
  workflow_dispatch:
  push:
    branches:
      - master
  # TODO: Remove this section:
  # pull_request:
  #  branches:
  #    - master
  schedule:
    - cron: "0 0 * * *"

- name: Restore lychee cache
  id: restore-cache
  uses: actions/cache/restore@v4
  with:
    path: .lycheecache
    key: cache-lychee-${{ github.sha }}
    restore-keys: cache-lychee-

- name: Link Checker
  uses: lycheeverse/[email protected]
  id: lychee_run
  with:
    args: |
      --cache
      --verbose
      --no-progress
      '.'
  continue-on-error: true

- name: Save lychee cache
  uses: actions/cache/save@v4
  if: always()
  with:
    path: .lycheecache
    key: ${{ steps.restore-cache.outputs.cache-primary-key }}

This solution splits the cache operation into separate restore and save steps and uses `if: always()` to ensure the cache is saved even when the link check fails. I haven't tried it, but I'd be thankful if you gave it a shot. If it works, we can update the docs accordingly, because I suspect a lot of people will run into this problem now or in the future as their repos grow.

The part about the workflow triggers at the start of the YAML file isn't strictly necessary. I think the important part is to get the cache fixed.

@GlenDC, looking at a recent run, I can also see that you have a lot of links:

# Summary

| Status        | Count |
|---------------|-------|
| 🔍 Total      | 1434  |
| ✅ Successful | 1309  |
| ⏳ Timeouts   | 0     |
| 🔀 Redirected | 0     |
| 👻 Excluded   | 44    |
| ❓ Unknown    | 0     |
| 🚫 Errors     | 81    |

It looks like only around 300 of the 1434 links are GitHub links, though, and you only run lychee once per day, so that's weird.

My hunch is that you're running into a secondary rate-limiting problem with GitHub, as described in the docs:

In addition to primary rate limits, GitHub enforces secondary rate limits in order to prevent abuse and keep the API available for all users.

You may encounter a secondary rate limit if you:

  • Make too many concurrent requests. No more than 100 concurrent requests are allowed. This limit is shared across the REST API and GraphQL API.

So if more than 100 requests to GitHub run concurrently (and that's likely, since lychee can run far more requests concurrently), you might be unfortunate enough to get rate-limited.

Can you try setting the max concurrency lower for a test? `--max-concurrency 20 --max-retries 0` or so.

Also, can you add a cache similar to how @jan-ferdinand does it in their pipeline?

mre avatar May 08 '25 20:05 mre

Can you try setting the max concurrency lower for a test? `--max-concurrency 20 --max-retries 0` or so.

Currently I don't set any args myself. Can you recommend what exactly I should change in https://github.com/plabayo/rama/blob/main/.github/workflows/links.yml please?

GlenDC avatar May 08 '25 20:05 GlenDC

      - name: Link Checker
        id: lychee
        uses: lycheeverse/[email protected]
        with:
          args: --base . --verbose --no-progress './**/*.md' './**/*.html' './**/*.rst' --max-concurrency 20 --max-retries 0
          token: ${{ secrets.GITHUB_TOKEN }}

mre avatar May 08 '25 21:05 mre

Thx. Updated it. Will let you know how it went tomorrow!

GlenDC avatar May 08 '25 21:05 GlenDC

Thanks for your help! I like the changes you suggest and have incorporated them. They might fix the issue; I'll report back if they do.

From the symptoms I have seen so far, I think it's most likely that GitHub has changed their secondary rate-limiting policy somehow.[^conclusion][^change] I'm sure that the workflow did not hit the (documented) primary rate limit, as the workflow was triggered only by the schedule for several days in a row. As far as I can see, with the new caching behavior, having on-push and on-PR triggers should be unproblematic.

A small note regarding the suggested change to the workflow file, in particular the "Save lychee cache" step: the docs discourage `if: always()`, since it makes it impossible to cancel the workflow during that step. They suggest `if: ${{ !cancelled() }}` instead.[^always]

[^always]: I'm aware that the docs for the cache/save action suggest always(), but the other documentation seems more authoritative on that front.

[^change]: I can't see this reflected in the docs.

[^conclusion]: I conclude this because (1) my workflow file used to work, (2) I have not changed my workflow file recently, (3) lychee-action has no recent changes that might impact GitHub rate limiting, and (4) lychee has no recent changes that might impact GitHub rate limiting.
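
For concreteness, a sketch of the save step with that condition (same step shape as in the workflow suggested earlier in this thread, untested):

```yaml
- name: Save lychee cache
  uses: actions/cache/save@v4
  # !cancelled() still runs after a failed link check,
  # but allows the workflow to be cancelled during this step.
  if: ${{ !cancelled() }}
  with:
    path: .lycheecache
    key: ${{ steps.restore-cache.outputs.cache-primary-key }}
```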

jan-ferdinand avatar May 09 '25 08:05 jan-ferdinand

Aha! Thanks for the note about if: always(). In case this ends up fixing the problem, we should take that into account when updating the docs.

mre avatar May 09 '25 09:05 mre

Didn't seem to work @mre : https://github.com/plabayo/rama/actions/runs/14934933577

GlenDC avatar May 09 '25 19:05 GlenDC

I don't know. It looks like you get blocked pretty rapidly, though? Do you use the GitHub token anywhere else? Maybe you'd like to create a new token and try again?

mre avatar May 10 '25 12:05 mre

I don't know. It looks like you get blocked pretty rapidly, though? Do you use the GitHub token anywhere else? Maybe you'd like to create a new token and try again?

Pretty certain that's the GitHub token provided automatically by GitHub for Actions runners. I didn't have to set it, and I can't see it either. https://docs.github.com/en/actions/security-for-github-actions/security-guides/automatic-token-authentication <= https://github.com/plabayo/rama/blob/cc59ef6b3b3f0baa9b641e16cb13fe1985fd407c/.github/workflows/links.yml#L20

GlenDC avatar May 10 '25 19:05 GlenDC

Ah, I see. But you could create a custom token and pass that in for a test. Description is in the docs. I just want to know if that helps. There is also a way to get the current rate limit for a token via an API call.

mre avatar May 11 '25 02:05 mre

API call: https://www.endorlabs.com/learn/how-to-get-the-most-out-of-github-api-rate-limits

curl \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: token <TOKEN>" \
  https://api.github.com/rate_limit

{
  "resources": {
    "core": {
      "limit": 5000,
      "remaining": 4999,
      "reset": 1372700873,
      "used": 1
    },
    "search": {
      "limit": 30,
      "remaining": 18,
      "reset": 1372697452,
      "used": 12
    },
    "graphql": {
      "limit": 5000,
      "remaining": 4993,
      "reset": 1372700389,
      "used": 7
    },
    "integration_manifest": {
      "limit": 5000,
      "remaining": 4999,
      "reset": 1551806725,
      "used": 1
    },
    "code_scanning_upload": {
      "limit": 500,
      "remaining": 499,
      "reset": 1551806725,
      "used": 1
    }
  },
  "rate": {
    "limit": 5000,
    "remaining": 4999,
    "reset": 1372700873,
    "used": 1
  }
}
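
For a quick scripted check, the core `remaining` value can be pulled out of that response with plain coreutils; a rough sketch (the sample string below is a compacted stand-in for live `curl` output):

```shell
# Sketch: extract the first "remaining" field (the core API quota).
# In CI, pipe the real `curl -s ... /rate_limit` output in instead of $sample.
sample='{"resources":{"core":{"limit":5000,"remaining":4999,"reset":1372700873,"used":1}}}'
printf '%s\n' "$sample" | tr ',{' '\n\n' | grep -m1 '"remaining"'
# Prints: "remaining":4999
```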

mre avatar May 11 '25 02:05 mre

What I've found is that lychee currently doesn't appear to de-duplicate requests to the same URL. I've now reduced the concurrency to 1 and added a cache, which somewhat mitigates this but obviously makes the run take much longer. That's not a big deal because it runs on a schedule anyway.
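
A quick way to gauge how much duplication is in play: lychee's `--dump` flag prints the links it would check without actually checking them, so duplicates can be counted locally. A sketch (the `printf` stand-in replaces real `lychee --dump` output; the counting pipeline itself is plain coreutils):

```shell
# Sketch: list links that appear more than once, with their counts.
# Replace the printf stand-in with: lychee --dump './**/*.md'
printf '%s\n' \
  'https://github.com/lycheeverse/lychee' \
  'https://example.com' \
  'https://github.com/lycheeverse/lychee' \
  | sort | uniq -cd
# Prints a count of 2 for the repeated lychee URL.
```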

thomaseizinger avatar May 12 '25 01:05 thomaseizinger

Similar to and inspired by @thomaseizinger, I'm now using the following, rather conservative arguments for lychee:

--cache
--cache-exclude-status 400..=599
--max-concurrency 1
--max-retries 1
--retry-wait-time 60

Unsurprisingly, this slows the workflow down considerably (~2 hours instead of the previous ~20 seconds), but at least it provides accurate results once again.
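
In lychee-action form, those flags would look something like this (an untested sketch, following the step shape used earlier in this thread):

```yaml
- name: Link Checker
  uses: lycheeverse/[email protected]
  with:
    args: >-
      --cache
      --cache-exclude-status '400..=599'
      --max-concurrency 1
      --max-retries 1
      --retry-wait-time 60
      '.'
    token: ${{ secrets.GITHUB_TOKEN }}
```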

jan-ferdinand avatar May 12 '25 07:05 jan-ferdinand

I sure hope that https://github.com/lycheeverse/lychee/issues/1605 will eventually fix this.

mre avatar May 12 '25 21:05 mre

We're running into the same issue here, but some of the workarounds proposed here will take the workflow duration from 20 seconds to 2 hours, which is a no-go for us. Will https://github.com/lycheeverse/lychee/issues/1605 be fixed any time soon? I see it's targeted for v1.0, but that looks quite far in the future.

afalhambra-hivemq avatar May 13 '25 15:05 afalhambra-hivemq

Will https://github.com/lycheeverse/lychee/issues/1605 be fixed any time soon? I see it's targeted for v1.0, but that looks quite far in the future.

It absolutely will. The 1.0 milestone just means it's a goal we set for ourselves to unlock a 1.0 release, but we hope to add the feature sooner than that. It looks like GitHub made some changes to the secondary rate-limiting behavior, and all users are affected.

mre avatar May 13 '25 23:05 mre

I just saw this discussion and the GitHub announcement, which explains the situation. https://news.ycombinator.com/item?id=43936992

Some users commented that GitHub might have started rolling out the change before the announcement, which matches with what we saw.

So I don't think there's a simple fix. I will try to expedite work on per-host rate limiting.

mre avatar May 15 '25 06:05 mre

Thanks. Fwiw, for us this is just a cronjob running once per day in background. Not something I'm waiting on. So if I have a way to make it work (eventually) by trading in time, that is just fine. At least for us.

GlenDC avatar May 15 '25 06:05 GlenDC

I just saw this discussion and the GitHub announcement, which explains the situation. news.ycombinator.com/item?id=43936992

Some users commented that GitHub might have started rolling out the change before the announcement, which matches with what we saw.

So usage from GitHub actions runners with $GITHUB_TOKEN is considered unauthenticated? That is odd.

thomaseizinger avatar May 15 '25 07:05 thomaseizinger

That's not how I read it:

If you rely on unauthenticated access, you may experience the new rate limits. However, developers using authenticated requests will continue to enjoy higher rate limits with uninterrupted access to their workflows and tools.

But they might have changed the secondary rate limit at the same time (the one that prevents bursts of requests). I'm honestly not sure what's going on quite yet.

mre avatar May 15 '25 08:05 mre

Any new guidelines / intermediate measures that can be taken here?

GlenDC avatar Jun 04 '25 13:06 GlenDC

BTW the rate limiting issues have disappeared in https://github.com/plabayo/rama/. Did GitHub change their mind again?

GlenDC avatar Aug 05 '25 11:08 GlenDC

Maybe? At least we haven't changed anything on the lychee side. Thanks for the update.

mre avatar Aug 07 '25 23:08 mre