Add a link checker in CI to avoid broken links

Open rgommers opened this issue 5 years ago • 1 comments

This is a follow-up to gh-232. Getting a link checker to work with Hugo isn't completely trivial, but should be doable in CI. See https://discourse.gohugo.io/t/link-checker-for-go-hugo/2202/18

rgommers, May 20 '20 18:05

I gave this a quick try; it is doable without too much effort, though there are a lot of things to fix. Steps:

  • Install htmltest locally, see https://github.com/wjdp/htmltest
  • Add a .htmltest.yml in the root of the repo
  • Run ./bin/htmltest and see what it says. Either add config options to silence false positives, or fix things up.
  • Once it's passing, add it in CI.
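
A minimal sketch of what the CI step could look like once the local run passes. The function name `run_link_check` and its arguments are hypothetical; it assumes `hugo` is on the PATH and builds into `public`, that htmltest was installed to `./bin/htmltest`, and that htmltest exits with a non-zero status when it finds problems:

```shell
# Hypothetical helper for a CI job: build the site, then run htmltest on it.
# Assumptions (not from this thread): `hugo` is on PATH and writes to ./public,
# and htmltest was installed to ./bin/htmltest.
run_link_check() {
    htmltest_bin="${1:-./bin/htmltest}"
    build_cmd="${2:-hugo}"

    if [ ! -x "$htmltest_bin" ]; then
        echo "htmltest not found at $htmltest_bin" >&2
        return 2
    fi

    # Build the site first; stop if the build fails.
    "$build_cmd" || return 1

    # htmltest picks up .htmltest.yml from the current directory by default.
    # Its exit status becomes the function's status, which is what should
    # fail the CI job.
    "$htmltest_bin"
}
```

Calling `run_link_check` from the repo root after the install step would then be enough for the CI job; the two optional arguments only exist so the binary path and build command can be swapped out.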

.htmltest.yml should be something like:

DirectoryPath: "public"
EnforceHTTPS: true
IgnoreURLs:
- "example.com"
CacheExpires: "6h"
IgnoreDirectoryMissingTrailingSlash: true

This gives:

$ ./bin/htmltest
htmltest started at 03:46:33 on public
========================================================================
contribute/index.html
  alt text empty --- contribute/index.html --> /images/logos/numpy.svg
  is not an HTTPS target --- contribute/index.html --> http://github.com/numpy/numpy
  empty hash --- contribute/index.html --> #
learn/index.html
  alt text empty --- learn/index.html --> /images/logos/numpy.svg
....

(many more issues)

Missing alt text is an accessibility issue, and we should prioritize fixing that. Broken links too, of course. The rest I haven't looked at in detail.
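
To help decide what to fix first, the output can be tallied by issue type. This is a rough sketch: `summarize_htmltest` is a made-up helper name, and it assumes every finding is printed as `<message> --- <file> --> <target>`, as in the excerpt above:

```shell
# Sketch: count htmltest findings per issue type, most frequent first.
# Assumes finding lines have the form "  <message> --- <file> --> <target>".
summarize_htmltest() {
    # Only lines containing " --- " are findings; the text before it is the
    # issue type. Strip leading whitespace, then count occurrences.
    awk -F ' --- ' '
        NF > 1 { sub(/^[ \t]+/, "", $1); count[$1]++ }
        END    { for (m in count) printf "%d\t%s\n", count[m], m }
    ' | sort -rn
}
```

Usage would be something like `./bin/htmltest 2>&1 | summarize_htmltest`, which prints one line per issue type with its count.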

rgommers, May 08 '21 13:05