Figure out how to make GitHub link checking work consistently
Link checks of GitHub URLs have been disabled due to intermittent CI failures caused by rate-limiting, but we should probably try to figure out how to turn them back on.
@tsibley sez:
I wonder though: linkcheck has support for rate-limiting itself and is supposed to back off and retry. Is that not working as intended? Maybe we need to adjust its config rather than exclude an entire domain of links, esp. ones which I expect are somewhat common and particularly prone to breakage (e.g. source code moves when a ref name is used in the URL instead of a commit id).
and then continues:
Linkcheck also has support for per-domain auth, so we could pass in unprivileged GitHub creds in CI.
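For reference, a minimal sketch of what per-domain auth could look like in `conf.py`, assuming "linkcheck" here is Sphinx's linkcheck builder and its `linkcheck_auth` option. The environment variable name `LINKCHECK_GITHUB_TOKEN`, the helper `build_linkcheck_auth`, and the placeholder username are all hypothetical, and whether github.com relaxes its limits for requests authenticated this way is untested:

```python
import os
from typing import List, Mapping, Tuple

# (URL regex, (username, password)) pairs, the shape Sphinx's linkcheck_auth accepts.
AuthEntry = Tuple[str, Tuple[str, str]]


def build_linkcheck_auth(env: Mapping[str, str]) -> List[AuthEntry]:
    """Return linkcheck_auth entries, or an empty list when no token is set.

    LINKCHECK_GITHUB_TOKEN is a hypothetical variable name; CI would need to
    export an unprivileged token under whatever name we settle on.
    """
    token = env.get("LINKCHECK_GITHUB_TOKEN")
    if not token:
        # Local builds without a token fall back to unauthenticated checks.
        return []
    # Assumption: basic auth with a placeholder username and the token as password.
    return [(r"https://github\.com/.+", ("x-access-token", token))]


# In conf.py this would be assigned at module level.
linkcheck_auth = build_linkcheck_auth(os.environ)
```

Returning an empty list when the variable is unset keeps local builds working without credentials.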
Re: linkcheck's rate-limiting support: it requires a 429 ("Too Many Requests") response, and a quick scan of failed CI runs suggests GitHub returns 403s instead in this case?
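To make the 429 vs. 403 distinction concrete, here is an illustrative sketch of the decision as described above. This is not linkcheck's actual code; the function name `backoff_delay` and the 60-second default are assumptions for illustration only:

```python
from typing import Mapping, Optional


def backoff_delay(
    status: int,
    headers: Mapping[str, str],
    default_delay: float = 60.0,  # assumed default, for illustration only
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no backoff applies."""
    if status != 429:
        # A 403 lands here: it is reported as a plain failure, with no retry.
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return float(retry_after)
        except ValueError:
            pass  # The HTTP-date form of Retry-After is not handled in this sketch.
    return default_delay


print(backoff_delay(429, {"Retry-After": "30"}))  # 30.0
print(backoff_delay(403, {}))  # None
```

If GitHub really does answer with 403 when throttling, a checker following this logic would never back off for those links, which would match the failures seen in CI.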
The "pass in creds via CI" plan seems like the right approach, but other ideas are welcome.
I just noticed in the docs.nextstrain.org CI linkcheck job that linkcheck seems to be doing the right thing by rate-limiting itself.