linkchecker
Anchor not found with large number of anchors
Summary
Running AnchorCheck on pages with a large number of anchors produces some false positives, even though most anchors are found correctly and listed among the available anchors.
Steps to reproduce
- Run
linkchecker --config=<(echo '[AnchorCheck]') -r 1 --verbose https://nixos.org/explore.html
Actual result
There is a large number of “Anchor not found” warnings, and in each case the reported anchor is missing from the list of available anchors.
Expected result
All anchors that exist will be properly detected as present.
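The expected behaviour can be checked independently of linkchecker. The following is a minimal sketch using only Python's standard library. It is not linkchecker's implementation, and `AnchorCollector` and `missing_anchors` are hypothetical names introduced for illustration. It assumes that an anchor is any element's `id` attribute or the `name` attribute of an `<a>` element, and it only looks at same-page links of the form `#fragment`.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class AnchorCollector(HTMLParser):
    """Collect anchor targets and same-page fragment links (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.anchors = set()    # every `id`, plus `name` on <a> elements
        self.fragments = set()  # fragments of links that point into this page

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.anchors.add(attrs["id"])
        if tag == "a":
            if attrs.get("name"):
                self.anchors.add(attrs["name"])
            url, fragment = urldefrag(attrs.get("href") or "")
            # Assumption: only links with an empty URL part target this page.
            if not url and fragment:
                self.fragments.add(fragment)


def missing_anchors(html):
    """Return the linked fragments that have no matching anchor in `html`."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    # Compare only after the whole document has been parsed.
    return sorted(collector.fragments - collector.anchors)
```

Feeding the downloaded page source to `missing_anchors` gives a list that can be compared by hand against the “Anchor not found” warnings. Elements inside inline SVG are handled like any other element, since the parser reports their `id` attributes in the same way.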
Environment
- Operating system: NixOS Linux
- Linkchecker version: 10.0.1
- Python version: 3.9.6
- Install method: distribution package (patched to use latest version)
- Site URL: https://nixos.org/explore.html
Configuration file
[AnchorCheck]
Logs
Other notes
- I also noticed that the list of available anchors printed in the logs is small at first but grows steadily as the checking progresses.
- I first noticed it on a page that has most of its anchors inside an inline SVG document, but that seems to be just a coincidence: I can reproduce it on https://nixos.org/manual/nixpkgs/stable/ as well.
- Weirdly, if I download the file and check it locally, it does not find any issue, though it does not seem to check any links either: local-check-does-nothing.txt
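The first note, that the list of available anchors grows while checking is in progress, would be consistent with fragments being compared against a list that is still being filled. That is only a guess and is not confirmed anywhere in this report. The sketch below is a toy model in plain Python, not linkchecker code, and `check_while_collecting` and `check_after_collecting` are hypothetical names. It shows how the same page can yield false positives under one ordering and none under the other.

```python
def check_while_collecting(events):
    """Check each link against only the anchors seen so far (toy model)."""
    seen = set()
    not_found = []
    for kind, name in events:
        if kind == "anchor":
            seen.add(name)
        elif kind == "link" and name not in seen:
            # A forward reference is reported even though the anchor exists later.
            not_found.append(name)
    return not_found


def check_after_collecting(events):
    """Collect every anchor first, then check all links."""
    anchors = {name for kind, name in events if kind == "anchor"}
    return [name for kind, name in events if kind == "link" and name not in anchors]


# Document order: a table of contents links forward to sections defined later.
events = [
    ("link", "intro"),
    ("link", "usage"),
    ("anchor", "intro"),
    ("anchor", "usage"),
    ("link", "intro"),
]
```

In this model the first function reports both forward references as missing, while the second reports nothing, because every anchor is known before any link is checked.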
I think AnchorCheck has too many problems for us to leave it available, because there can be no confidence that it returns accurate results (see #575). If anyone thinks it has some use as it is, I suggest submitting a PR for some kind of "I really know what I am doing/enable broken things" option (of course, just patching your copy with a reversion is easier!).
The AnchorCheck plugin has been re-enabled with fixes thanks to work from Nathan Arthur. If any problems are found please report them as new issues.