ouroboros-consensus
ouroboros-consensus copied to clipboard
Check links on CI
Description
Markdown files contain a plenty of links which tend to get outdated occasionally. It's hard to notice when a certain link (local or global) becomes broken.
In the past, the xrefcheck tool was used to find broken links:
- https://github.com/IntersectMBO/ouroboros-consensus/pull/3
- https://github.com/IntersectMBO/ouroboros-consensus/pull/454
This PR automates xrefcheck running by adding it as a new GitHub Action job. It uses xrefcheck-action under the hood.
Resolves #4
In the second commit I tried to fix/ignore broken links detected by xrefcheck using my best judgement. AFAIU docs/website contains sources for https://ouroboros-consensus.cardano.intersectmbo.org/, so I used that site to check some links.
- Links in
References.md,index.mdandTechnicalReports.mddon't work on GitHub because they are invalid in the sense of Markdown, but they work fine on the website, so I didn't touch them and added a comment to makexrefcheckignore them. Glossary.mdcontains many links to anchors in the same file. These anchors work differently in Markdown and Docusarus as can be seen here. Ideally, they should be fixed to work on the resulting website, in which case all of them would be broken in the sense of Markdown. Fixing them for Docusaurus is out of scope of this PR, but since they will probably become broken in the sense of Markdown, I've instructedxrefcheckto ignore all links in that file.- There is also an anchor link in
VersioningSchemeDecision.md, but this file is not used in the resulting website, so I've fixed it according to Markdown rules.
There are also some global/absolute links that return 404 and I couldn't find how to fix them.
➥ In file docs/website/contents/for-developers/AbstractProtocol.md
bad reference (external) at src:7:3-178:
- text: "Ouroboros.Consensus.Tutorial.Simple"
- link: https://github.com/IntersectMBO/ouroboros-consensus/blob/master/ouroboros-consensus/src/tutorials/Ouroboros/Consensus/Tutorial/Simple.lhs
- anchor: -
⛂ Resource unavailable (404 Not Found)
➥ In file docs/website/contents/for-developers/AbstractProtocol.md
bad reference (external) at src:9:3-184:
- text: "Ouroboros.Consensus.Tutorial.WithEpoch"
- link: https://github.com/IntersectMBO/ouroboros-consensus/blob/master/ouroboros-consensus/src/tutorials/Ouroboros/Consensus/Tutorial/WithEpoch.lhs
- anchor: -
⛂ Resource unavailable (404 Not Found)
There are also some global/absolute links that return 404 and I couldn't find how to fix them.
Good catch! These are literate Haskell files that aren't rendered and uploaded to the website, which we actually might want to do. I'm creating a separate issue for this
EDIT: https://github.com/IntersectMBO/ouroboros-consensus/issues/903
Drive-by comment: some broken links detected here are (also) fixed in #884
@gromakovsky as discussed in https://github.com/IntersectMBO/ouroboros-consensus/issues/4#issuecomment-1408457570, would it be possible to demote the output to a warning?
as discussed in https://github.com/IntersectMBO/ouroboros-consensus/issues/4#issuecomment-1408457570, would it be possible to demote the output to a warning?
AFAIK there is no way to report "warning" status for a check, I see a feature request for it, but it's unresolved: https://github.com/orgs/community/discussions/11592
According to https://stackoverflow.com/a/75849954, it should be possible to report some warnings during a job run and finish the job as successful. I think we can do it: update xrefcheck to accept a custom format for reporting broken links and pass the format recognizable by GitHub Actions. I've created an issue for it in xrefcheck (https://github.com/serokell/xrefcheck/issues/297), but I can't promise how soon it will be done. One problem with this approach is that the job status will be green, so it will be easier to miss. But that's a trade-off.
Two other options have been proposed in the issue comment:
- Make the check not required for merging. This is already the case, this check is not required for merging.
- Make this a nightly (or weekly) check. I can do that, should be trivial.
I'm new to this repo, so please let me know how you would like to proceed:
- Do nothing since this check is not required for merging.
- Run this check on schedule. In this case do you have any preferences how often to run it? Once a day, week, etc.?
- Update it to always succeed, but report broken links as warnings recognizable by GitHub Actions. It will take some time to make the desired change in
xrefcheckand release a new version.
Drive-by comment: some broken links detected here are (also) fixed in https://github.com/IntersectMBO/ouroboros-consensus/pull/884
Right, after rebasing onto latest main I don't have any errors running xrefcheck locally. I've pushed the rebased branch.
First of all, thanks for your PR. I think having an automated link checker is a good thing to have, and xrefcheck could very well be that checker. However, xrefcheck-0.2.2 reports no errors on this branch, but xrefcheck@master does. I do have some question on that regard:
- Is it expected that a newer version will be released from
master? Many things seem to have changed since the latest release (link). - What is the fundamental reason for considering permanent redirects as errors? other tools like
markdown-link-checkseem to just check if the response status is correct. Why does xrefcheck go that extra step even for links that just add/at the end of the url? - How is it that
[email protected]does not find many permanent redirects? I'm running both versions on the same code and there are many redirects found by themasterversion that are not found by the0.2.2version.
With 0.2.2:
❯ /root/.local/bin/xrefcheck
Configuration file not found, using default config for GitHub repositories
All repository links are valid.
With master:
❯ /root/.local/bin/xrefcheck
Configuration file not found, using default config for GitHub repositories
=== Scan errors found ===
➥ In file docs/website/contents/for-developers/Glossary.md
scan error at src:1:1-31:
Unrecognised option "file" perhaps you meant <"ignore link"|"ignore paragraph"|"ignore all">
Scan errors dumped, 1 in total.
=== Invalid references found ===
➥ In file CONTRIBUTING.md
bad reference (external) at src:31:4-109:
- text: "Preflight guide"
- link: https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/PreflightGuide
Permanent redirect found:
-| https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/PreflightGuide
-> https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/PreflightGuide/
^-- stopped before this one
➥ In file CONTRIBUTING.md
bad reference (external) at src:32:4-96:
- text: "Glossary"
- link: https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary
Permanent redirect found:
-| https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary
-> https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary/
^-- stopped before this one
➥ In file CONTRIBUTING.md
bad reference (external) at src:32:4-96:
- text: "Glossary"
- link: https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary
Permanent redirect found:
-| https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary
-> https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary/
^-- stopped before this one
➥ In file docs/website/contents/for-developers/Glossary.md
bad reference (file-local) at src:82:24-74:
- text: "active slot coefficient"
- anchor: active-slot-coefficient
Anchor 'active-slot-coefficient' is not present, did you mean:
- active-slot-coefficient-f (header II) at src:9:1-31
➥ In file docs/website/contents/for-developers/Glossary.md
bad reference (file-local) at src:164:48-76:
- text: "Chain Growth"
- anchor: chain-growth
Anchor 'chain-growth' is not present, did you mean:
- chain-growth-property (header II) at src:78:1-25
➥ In file docs/website/contents/for-developers/Glossary.md
bad reference (file-local) at src:253:4-22:
- text: "Genesis"
- anchor: genesis
Anchor 'genesis' is not present, did you mean:
- genesis-block (header II) at src:167:1-17
- genesis-window (header II) at src:202:1-18
➥ In file docs/website/contents/for-developers/PreflightGuide.md
bad reference (external) at src:77:108-125:
- text: "Mithril client"
- link: https://mithril.network/doc
Permanent redirect found:
-| https://mithril.network/doc
-> https://mithril.network/doc/
^-- stopped before this one
➥ In file docs/website/contents/for-developers/PreflightGuide.md
bad reference (external) at src:78:8-35:
- text: "here"
- link: https://mithril.network/doc/manual/getting-started/bootstrap-cardano-node
Permanent redirect found:
-| https://mithril.network/doc/manual/getting-started/bootstrap-cardano-node
-> https://mithril.network/doc/manual/getting-started/bootstrap-cardano-node/
^-- stopped before this one
➥ In file docs/website/contents/for-developers/PreflightGuide.md
bad reference (external) at src:144:378-468:
- text: "Solana"
- link: https://docs.solana.com/running-validator/validator-reqs#hardware-recommendations
Permanent redirect found:
-| https://docs.solana.com/running-validator/validator-reqs#hardware-recommendations
-> https://docs.solanalabs.com/operations/requirements
^-- stopped before this one
➥ In file CONTRIBUTING.md
bad reference (external) at src:32:4-96:
- text: "Glossary"
- link: https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary
Permanent redirect found:
-| https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary
-> https://ouroboros-consensus.cardano.intersectmbo.org/docs/for-developers/Glossary/
^-- stopped before this one
Invalid references dumped, 10 in total.
int-index requested review from nfrisby, jasagredo, amesgen, fraser-iohk, dnadales and geo2a as code owners
For what it's worth, this review request was generated automatically by GitHub due to a rebase. I didn't mean to request another review just yet—not until I respond to the questions from the previous review.
Is it expected that a newer version will be released from master? Many things seem to have changed since the latest release.
Yes. We wanted to include a few other improvements before cutting a release, and now xrefcheck-0.3.0 is available.
What is the fundamental reason for considering permanent redirects as errors?
Two reasons:
-
For readers of a document, opening a link is more efficient if it is direct. Perhaps that alone isn't sufficient grounds to consider redirects as errors, but it makes them undesirable.
-
There are websites (e.g. GitLab) that don't follow the usual HTTP conventions and do redirects instead of responding with 404 if a document is unavailable, e.g. try https://gitlab.com/gitlab-org/gitlab/-/blob/master/asdf.
One could argue that GitLab shouldn't do that. However, since redirects are undesirable anyway, the pragmatic choice is to report them as errors.
How is it that
[email protected]does not find many permanent redirects?
The old version follows those permanent redirects instead of reporting them. Our experience of using xrefcheck with GitLab links made us reconsider that design choice, so xrefcheck-0.3.0 reports those redirects as errors.
Commits have been signed. However, 3 checks seem to be stuck 🤔
Those checks have passed, so the PR is all green now 🎉
I embarked this PR in the merge queue. Let's see if we can get it merged!
Thanks both for your contributions