Run html-validate on docs/ to find problems to fix?
Examples of things I see
- Prefer to use the native
- Trailing whitespace
- Use of conditional comments is deprecated (in the head, something about IE)
Maybe some of those are worth fixing.
https://www.npmjs.com/package/html-validate
IMO this is only worth doing if we can automate it, preferably with an R package.
From https://github.com/quarto-dev/quarto-cli/discussions/7489#discussioncomment-7490051 I found https://github.com/validator/validator/releases/tag/latest which could be used in a GHA workflow. Not an R package :sweat_smile:
Maybe we could just make it part of the release process? i.e. run vnu on pkgdown's own site?
yes! I'd need to run it on the current website, to see what it returns. :sweat_smile:
A lot of repeated warnings like this:
"file:/Users/hadleywickham/Documents/devtools/pkgdown/docs/dev/reference/build_site.html":2.140-2.161: error: A document must not include both a “meta” element with an “http-equiv” attribute whose value is “content-type”, and a “meta” element with a “charset” attribute.
"file:/Users/hadleywickham/Documents/devtools/pkgdown/docs/dev/reference/build_site.html":270.1-270.17: error: Duplicate ID “cb1-3”.
I can't figure out where either of those is generated 😭
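To help narrow down which pages trigger these two errors, here is a minimal sketch using only Python's standard library. It is not how vnu or pkgdown work internally; `DocCheck` and `check_html` are hypothetical names made up for illustration, and the check only covers the two conditions quoted above.

```python
from collections import Counter
from html.parser import HTMLParser


class DocCheck(HTMLParser):
    """Collect id attributes and the two kinds of encoding declaration."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()
        self.has_charset_meta = False
        self.has_http_equiv_meta = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids[attrs["id"]] += 1
        if tag == "meta":
            if "charset" in attrs:
                self.has_charset_meta = True
            if (attrs.get("http-equiv") or "").lower() == "content-type":
                self.has_http_equiv_meta = True


def check_html(text):
    """Return a list of human-readable problems found in an HTML string."""
    parser = DocCheck()
    parser.feed(text)
    parser.close()
    problems = [f"Duplicate ID {id_!r}" for id_, n in parser.ids.items() if n > 1]
    if parser.has_charset_meta and parser.has_http_equiv_meta:
        problems.append("Both meta charset and meta http-equiv content-type present")
    return problems
```

Running `check_html()` over the text of each file under `docs/` would at least show which pages carry a duplicated ID such as `cb1-3`, which may help trace the template or code path that emits it.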
Looks like libxml2 automatically adds the http-equiv meta tag 😬. I can strip it off after the fact, but then libxml2 just adds it again when the document is written out.
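If the tag really is re-added on every serialization, one possible workaround is to strip it from the written text rather than from the parsed document. The following is only a sketch under that assumption, not something pkgdown does; `strip_http_equiv_meta` is a hypothetical helper, and a regex over HTML is fragile by nature.

```python
import re

# Matches a <meta> tag whose http-equiv attribute is "content-type"
# (any case, any attribute order), plus the line break that follows it.
_HTTP_EQUIV_META = re.compile(
    r"""<meta\b[^>]*\bhttp-equiv\s*=\s*["']?content-type["']?[^>]*>[ \t]*\n?""",
    re.IGNORECASE,
)


def strip_http_equiv_meta(html_text):
    """Remove http-equiv content-type meta tags from serialized HTML text."""
    return _HTTP_EQUIV_META.sub("", html_text)
```

Because this operates on the text after it has been written, it sidesteps the parse and serialize round trip that puts the tag back.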
Reading the docs for read_html() reminded me of https://github.com/r-lib/xml2/issues/432 (I nearly opened a duplicate issue :sweat_smile:)
https://gitlab.gnome.org/GNOME/libxml2/-/issues/211
https://gitlab.gnome.org/GNOME/libxml2/-/issues/507
Having tried html-validate, I find its output vastly superior to vnu's. I've fixed the most obvious low-hanging fruit and I think that's probably sufficient. I don't think running it regularly is going to be terribly useful, as there are a few false positives and other things that just aren't worth fixing given the potential gain.