
Figure out how to make GitHub link checking work consistently

Open: genehack opened this issue 7 months ago • 1 comment

Link checks of GitHub URLs have been disabled due to stochastic CI failures caused by rate-limiting, but we should probably try to figure out how to turn them back on.

@tsibley sez:

I wonder though: linkcheck has support for rate-limiting itself and is supposed to back off and retry. Is that not working as intended? Maybe we need to adjust its config rather than exclude an entire domain of links, esp. ones which I expect are somewhat common and particularly prone to breakage (e.g. source code moves when a ref name is used in the URL instead of a commit id).
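For reference, a minimal sketch of what "adjust its config" could look like in a Sphinx `conf.py`. The option names are real Sphinx linkcheck settings, but the values are illustrative guesses, not tested against this repo's CI:

```python
# conf.py (sketch): tune linkcheck instead of ignoring github.com entirely.
# Values below are assumptions for illustration, not measured settings.

# Upper bound, in seconds, on how long linkcheck keeps backing off when a
# server signals rate-limiting without saying when to retry (default: 300).
linkcheck_rate_limit_timeout = 600.0

# Number of attempts per link before it is reported as broken (default: 1).
linkcheck_retries = 3

# Fewer parallel workers means fewer simultaneous requests to one host
# (default: 5).
linkcheck_workers = 2
```

Note that this only helps if the server actually signals rate-limiting in a way linkcheck recognizes.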

and then continues:

Linkcheck also has support for per-domain auth, so we could pass in unprivileged GitHub creds in CI.
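A rough sketch of the per-domain auth idea for `conf.py`, using Sphinx's `linkcheck_auth` setting. The environment variable names and the helper function are hypothetical, and whether github.com relaxes its limits for requests authenticated this way is an assumption that would need checking:

```python
import os


def github_linkcheck_auth(env):
    """Build a linkcheck_auth list from CI-provided credentials.

    LINKCHECK_GITHUB_USER and LINKCHECK_GITHUB_TOKEN are hypothetical
    variable names; they would be set as CI secrets.
    """
    user = env.get("LINKCHECK_GITHUB_USER")
    token = env.get("LINKCHECK_GITHUB_TOKEN")
    if not (user and token):
        # No creds available (e.g. local builds, PRs from forks):
        # fall back to unauthenticated checks.
        return []
    # linkcheck_auth takes (URL regex, auth info) pairs; the auth info is
    # passed through to requests, so a (user, password) tuple is Basic auth.
    return [(r"https://github\.com/.+", (user, token))]


linkcheck_auth = github_linkcheck_auth(os.environ)
```

Reading the creds from the environment keeps them out of the repo and lets the build still work when they are absent.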

and finally:

Re: linkcheck's rate-limiting support: it requires a 429 ("Too Many Requests") response and a quick scan of failed CI suggests GitHub issues 403s instead in this case?
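To make the 429 vs. 403 distinction concrete, here is a simplified model of the behavior described in that comment. It is not Sphinx's actual code, and the function name is made up for illustration:

```python
def linkcheck_outcome(status_code: int) -> str:
    """Simplified model: only a 429 triggers back-off and retry."""
    if status_code == 429:
        # "Too Many Requests": recognized as rate-limiting.
        return "back off and retry"
    if 200 <= status_code < 400:
        return "ok"
    # Anything else, including a 403 that is really a rate limit in
    # disguise, is treated as an ordinary failure.
    return "broken"
```

Under this model, a rate-limited request answered with a 403 is reported as a broken link, which would explain the stochastic CI failures.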

The "pass in creds via CI" plan seems like the right approach, but other ideas are welcome.

genehack avatar May 22 '25 17:05 genehack

I just noticed in the docs.nextstrain.org CI linkcheck job that linkcheck seems to be doing the right thing by rate-limiting itself:

Image

joverlee521 avatar Jun 06 '25 18:06 joverlee521