Add broken link checker
Description
The ITK documentation (both the Markdown files and the class documentation) frequently contains links to diverse websites. Some of these links may be broken. A tool (a pre-commit hook and/or a GHA workflow) to check for broken links would prevent broken links from being included, or would point out which links are broken across the code base.
Expected information
Broken links should not exist.
Actual information
Broken links may exist in the ITK documentation.
Versions
master
Additional Information
This probably applies to the ITK SWG, and maybe the ITK Sphinx Examples as well.
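As a rough illustration of the first step such a tool would need, here is a minimal sketch that collects the http(s) links from a piece of documentation text. The regular expression and the function name `extract_links` are assumptions made for this example; they are not part of ITK or of any existing checker.

```python
import re

# Assumption: a simple pattern is good enough for a first pass. It is not a
# full URL grammar and will cut off URLs that contain parentheses or quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_links(text):
    """Return the unique http(s) URLs in text, in order of first appearance."""
    found = []
    for match in URL_PATTERN.finditer(text):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:")
        if url not in found:
            found.append(url)
    return found
```

Each extracted URL would then still have to be requested to see whether it resolves, which is the slow part.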
I locally ran the W3C link checker on the files produced by the documentation build. Quite a few warnings come from:
-
permanent redirects, especially:
List of redirects
https://itk.org/ITKExamples -> https://examples.itk.org/
  Line: 140
  Code: 301 -> 200 OK
  To do: This is a permanent redirect. The link should be updated.
https://itk.org/ITKExamples/src/Core/Common/ProduceImageProgrammatically/Documentation.html -> https://examples.itk.org/src/core/common/produceimageprogrammatically/documentation
  Line: 143
  Code: 301 -> 200 OK
  To do: This is a permanent redirect. The link should be updated.
There are 2 possible solutions for this:
- the problems are fixed in the sources, in total just a few files:
  ../../../Documentation/docs/contributing/upload_binary_data.md
  ../../../Documentation/docs/releases/4.5.md
  ../../../Documentation/docs/releases/5.0.md
  ../../../Documentation/docs/releases/5.0a02.md
  ../../../Documentation/docs/releases/5.0b03.md
  ../../../Documentation/docs/releases/5.1.md
  ../../../Documentation/docs/releases/5.2.md
  ../../../Examples/README.md
  ../../../GettingStarted.md
  ../../../Modules/Remote/SphinxExamples.remote.cmake
  ../../../Utilities/Doxygen/DoxygenConfig.cmake
  ../../../Utilities/KWStyle/KWStyle.cmake
  ../../../Wrapping/DoxygenConfig.cmake
- I add an exception for these redirects to my link test script.
Please advise.
-
A message about a missing file like:
List of broken links and other issues:
file:///.../build/Utilities/Doxygen/html/doxygen.png
  Line: 132
  Code: 404 File `.../build/Utilities/Doxygen/html/doxygen.png' does not exist
  To do: The link is broken. Double-check that you have not made any typo, or mistake in copy-pasting. If the link points to a resource that no longer exists, you may want to remove or fix the link.
The doxygen.png file is used in Documentation/Doxygen/DoxygenFooter.html, which should be updated when using a newer doxygen version (the same applies to DoxygenHeader.html). The doxygen.png file has been replaced by doxygen.svg since doxygen version 1.8.19.
-
A few messages from the css file like:
List of broken links and other issues:
file:///.../build/Utilities/Doxygen/html/doc.png
  Line: (N/A)
  Code: 404 File `.../build/Utilities/Doxygen/html/doc.png' does not exist
  To do: The link is broken. Double-check that you have not made any typo, or mistake in copy-pasting. If the link points to a resource that no longer exists, you may want to remove or fix the link.
This is because the cmake settings file contains the setting:
set(DOXYGEN_HTML_EXTRA_STYLESHEET "${ITK_SOURCE_DIR}/Documentation/Doxygen/ITKDoxygenStyle.css")
which results in the following line in the doxygen settings file:
HTML_EXTRA_STYLESHEET = .../Documentation/Doxygen/ITKDoxygenStyle.css
Comparing this file with the original 1.8.15 file shows that they are identical. HTML_EXTRA_STYLESHEET is there to set extra settings; the way it is used now, it overrules most of the settings, so quite a few of the new features are missing. The png files have been replaced since doxygen version 1.9.7.
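The second solution for the permanent redirects (an exception in the link test script) could look roughly like the sketch below. It assumes the text after `Code:` in each report entry has the shape shown above (for example `301 -> 200 OK` or `404 Not Found`); the function names and the `allowed_redirect_prefixes` parameter are hypothetical and not part of the W3C link checker.

```python
def classify(code_field):
    """Classify the text that follows 'Code:' in a report entry."""
    # Assumption: status codes are separated by '->' and each part starts
    # with the numeric code, as in '301 -> 200 OK'.
    parts = [part.strip() for part in code_field.split("->")]
    first = int(parts[0].split()[0])
    final = int(parts[-1].split()[0])
    if final >= 400:
        return "broken"
    if first == 301:
        return "permanent-redirect"
    return "ok"


def needs_attention(url, code_field, allowed_redirect_prefixes=()):
    """Return True unless the entry is fine or a redirect we chose to accept."""
    kind = classify(code_field)
    if kind == "broken":
        return True
    if kind == "permanent-redirect":
        return not url.startswith(tuple(allowed_redirect_prefixes))
    return False
```

With `allowed_redirect_prefixes=["https://itk.org/ITKExamples"]` the redirects listed above would no longer be reported, while a redirect that ends in a 404 would still be flagged.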
:100: Thanks for doing this. Can a GHA workflow file be added to check the links using the checker that you have used? Or, can pre-commit do this job by adding some relevant task?
I run the W3C link checker version 4.81 (see e.g. http://validator.w3.org/docs/checklink.html) on my local PC, and it needs the built documentation. There might be better link checkers nowadays, though. Unfortunately, it also reports some double / missing doxygen links, which we (the doxygen developers) will have to look into in more detail.
-
Can a GHA workflow file be added to check the links using the checker that you have used?
This would probably be possible, but I have not tried it and have no experience with it.
-
Or, can pre-commit do this job by adding some relevant task?
As written, the tool I use requires the built documentation, and the link checking as it stands is also quite slow, so I don't think it would be suitable for a pre-commit task.
As far as I can see now, it might be suitable for a GitHub Action (but e.g. run only once a day / week).
- I will check the current set of remaining problems and report the ITK-related link problems here. Unfortunately, the link checker gives the name of the html file and not the origin. Also, one link problem might occur multiple times in the sources or in the html files; I will normally report it only once.
- I also think (but this is far outside my jurisdiction) that ITK should move to a newer version of doxygen (ideally the current release), at least for ITK 6.0.0 and newer versions.
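Since the checker only names the generated html file, locating the origin of a reported link means searching the sources. A minimal sketch of such a search follows; the function name `find_link_origins` and the default list of file suffixes are assumptions for illustration, not an existing ITK utility.

```python
from pathlib import Path


def find_link_origins(root, url, suffixes=(".md", ".h", ".cmake", ".html")):
    """Return (relative path, line number) for every source line containing url."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Assumption: only these file types are worth scanning.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if url in line:
                hits.append((path.relative_to(root).as_posix(), number))
    return hits
```

Calling it with the root of a source checkout and one reported URL lists every place where the link would have to be fixed, which also shows when one problem occurs multiple times.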
Continuation of https://github.com/InsightSoftwareConsortium/ITK/issues/4240#issuecomment-2614514448
-
permanent redirects
List of redirects
https://www.itk.org/ -> https://itk.org/
  Line: 160
  Code: 301 -> 200 OK
  To do: This is a permanent redirect. The link should be updated.
http://commonfund.nih.gov/bioinformatics -> https://commonfund.nih.gov/bioinformatics
  Line: 136
  Code: 301 -> 200 OK
  To do: This is a permanent redirect. The link should be updated.
https://www.creatis.insa-lyon.fr/ -> https://www.creatis.insa-lyon.fr/site/fr
  Line: 136
  Code: 301 -> 200 OK
  To do: This is a permanent redirect. The link should be updated.
http://www.fp.ucalgary.ca/mhallbey/tutorial.htm -> https://prism.ucalgary.ca/handle/1880/51900
http://niftilib.sourceforge.net/ -> https://niftilib.sourceforge.net/
  Line: 161
  Code: 301 -> 200 OK
  To do: This is a permanent redirect. The link should be updated.
also
http://users.polytech.unice.fr/~lingrand/MarchingCubes/algo.html -> https://users.polytech.unice.fr/~lingrand/MarchingCubes/algo.html
https://www.itk.org/pipermail/insight-users/2008-May/026112.html -> https://itk.org/pipermail/insight-users/2008-May/026112.html
and more references to pipermail
http://www.poynton.com/ColorFAQ.html -> https://poynton.ca/ColorFAQ.html
https://midasjournal.org/browse/publication/825 -> https://midasjournal.org/browse/publication/825/
https://rules.sonarsource.com/cpp/RSPEC-6032/ -> https://rules.sonarsource.com/cpp/rspec-6032/
https://www.na-mic.org/Wiki/index.php/NAMIC_Wiki:DTI:ITK-DiffusionTensorPixelType -> https://www.na-mic.org/wiki/NAMIC_Wiki:DTI:ITK-DiffusionTensorPixelType
http://www.openjpeg.org/ -> https://www.openjpeg.org/
http://www.libtiff.org/ -> https://libtiff.gitlab.io/libtiff/
https://code.google.com/p/double-conversion/ -> https://github.com/google/double-conversion
http://www.nitrc.org/projects/gifti/ -> https://www.nitrc.org/projects/gifti/
http://www.netlib.org/slatec/ -> https://www.netlib.org/slatec/
https://www.birncommunity.org/ -> https://writepaperfor.me/birncommunity-org
http://vxl.sourceforge.net/ -> https://vxl.sourceforge.net/
and more sourceforge related http -> https links
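Many of the redirects above only change the scheme from http to https, while others point to a different host or path and deserve a closer look (the birncommunity.org target, for instance). A small sketch to separate the two groups is given below; the function name `is_scheme_only_upgrade` is an assumption for this example.

```python
from urllib.parse import urlsplit


def is_scheme_only_upgrade(source, target):
    """True when target equals source except for the http -> https scheme change."""
    src, dst = urlsplit(source), urlsplit(target)
    return (
        src.scheme == "http"
        and dst.scheme == "https"
        and src.netloc.lower() == dst.netloc.lower()
        # Treat an empty path and '/' as the same location.
        and (src.path or "/") == (dst.path or "/")
        and src.query == dst.query
    )
```

Redirects for which this returns True can probably be fixed mechanically; the rest need a manual check of the new target.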
-
broken link:
http://www.cssip.uq.edu.au/meastex/www/algs/algs/algs.html
  Line: 153
  Code: 500 Can't connect to www.cssip.uq.edu.au:80 (Name or service not known)
  To do: This is a server side problem. Check the URI.
https://analyzedirect.com/support/10.0Documents/Analyze_Resource_01.pdf
  Line: 135
  Code: 404 Not Found
  To do: The link is broken. Double-check that you have not made any typo, or mistake in copy-pasting. If the link points to a resource that no longer exists, you may want to remove or fix the link.
https://doi.org/10.54294/olkmog9
  Line: 132
  Code: 404 Not Found
  To do: The link is broken. Double-check that you have not made any typo, or mistake in copy-pasting. If the link points to a resource that no longer exists, you may want to remove or fix the link.
https://itk.org/ITKExamples/src/Core/Common/CustomOperationToEachPixelInImage/Documentation.html
  Line: 143
  Code: 500 Server closed connection without sending any data back
  To do: This is a server side problem. Check the URI.
and
https://itk.org/ITKExamples/src/Filtering/Smoothing/ApplyMedianFilter/Documentation.html
also
https://www.cs.tut.fi/~ant/histthresh
https://citeseer.ist.psu.edu/sezgin04survey.html
-
broken links from redirects. NOTE: the double slash (//) inside the redirect
https://itk.org/ITKExamples/src/Core/Common/RandomSelectPixelFromRegionWithoutReplace/Documentation.html -> https://examples.itk.org//src/Core/Common/RandomSelectPixelFromRegionWithoutReplace/Documentation.html
  Line: 217
  Code: 301 -> 404 Not Found
  To do: The link is broken. Double-check that you have not made any typo, or mistake in copy-pasting. If the link points to a resource that no longer exists, you may want to remove or fix the link.
also
https://itk.org/ITKExamples/src/Core/Common/CreateAnother/Documentation.html -> https://examples.itk.org//src/Core/Common/CreateAnother/Documentation.html
https://itk.org/ITKExamples/src/Core/Mesh/ConvertMeshToUnstructuredGrid/Documentation.html -> https://examples.itk.org//src/Core/Mesh/ConvertMeshToUnstructuredGrid/Documentation.html
https://itk.org/ITKExamples/src/Filtering/ImageIntensity/RescaleAnImage/Documentation.html -> https://examples.itk.org//src/Filtering/ImageIntensity/RescaleAnImage/Documentation.html
https://itk.org/ITKExamples/src/IO/ImageBase/Creade3DFromSeriesOf2D/Documentation.html -> https://examples.itk.org//src/IO/ImageBase/Creade3DFromSeriesOf2D/Documentation.html
https://itk.org/ITKExamples/src/Core/Common/Transparency/Documentation.html -> https://examples.itk.org//src/Core/Common/Transparency/Documentation.html
https://itk.org/ITKExamples/src/Core/Common/OutOfBoundsPixelsReturnConstValue/Documentation.html -> https://examples.itk.org//src/Core/Common/OutOfBoundsPixelsReturnConstValue/Documentation.html
https://itk.org/ITKExamples/src/Core/SpatialObjects/LineSpatialObject/Documentation.html -> https://examples.itk.org//src/Core/SpatialObjects/LineSpatialObject/Documentation.html
https://itk.org/ITKExamples/src/Filtering/Smoothing/SmoothWithRecursiveGaussian/Documentation.html -> https://examples.itk.org//src/Filtering/Smoothing/SmoothWithRecursiveGaussian/Documentation.html
https://itk.org/ITKExamples/src/Filtering/ImageFeature/FindZeroCrossings/Documentation.html -> https://examples.itk.org//src/Filtering/ImageFeature/FindZeroCrossings/Documentation.html
https://itk.org/ITKExamples/src/Core/SpatialObjects/LineSpatialObject/Documentation.html -> https://examples.itk.org//src/Core/SpatialObjects/LineSpatialObject/Documentation.html
https://itk.org/ITKExamples/src/Filtering/ImageLabel/ExtractBoundariesOfBlobsInBinaryImage/Documentation.html -> https://examples.itk.org//src/Filtering/ImageLabel/ExtractBoundariesOfBlobsInBinaryImage/Documentation.html
https://itk.org/ITKExamples/src/Filtering/ImageFeature/RequestedRegion/Documentation.html -> https://examples.itk.org//src/Filtering/ImageFeature/RequestedRegion/Documentation.html
https://itk.org/ITKExamples/src/Filtering/ImageIntensity/SetOutputPixelToMax/Documentation.html -> https://examples.itk.org//src/Filtering/ImageIntensity/SetOutputPixelToMax/Documentation.html
https://itk.org/ITKExamples/src/Filtering/ImageIntensity/SetOutputPixelToMin/Documentation.html -> https://examples.itk.org//src/Filtering/ImageIntensity/SetOutputPixelToMin/Documentation.html
https://itk.org/ITKExamples/src/Filtering/Smoothing/ApplyMeanFilter/Documentation.html -> https://examples.itk.org//src/Filtering/Smoothing/ApplyMeanFilter/Documentation.html
-
broken links with redirects
https://www.itk.org/mailman/private/insight-developers/2009-February/011732.html -> https://itk.org/mailman/private/insight-developers/2009-February/011732.html
https://www.itk.org/Wiki/Proposals:Orientation -> https://itk.org/Wiki/Proposals:Orientation
  Line: 1910
  Code: 301 -> 500 Internal Server Error
  To do: This is a server side problem. Check the URI.
  Fragments: Some_notes_on_the_DICOM_convention_and_current_ITK_usage Line: 1910
ftp://ftp.inria.fr/INRIA/tech-reports/RR/RR-1893.ps.gz
see: https://inria.hal.science/inria-00074778/en/
https://www.cs.unc.edu/~styner/docs/tmi00.pdf
https://www.cs.unc.edu/~styner/docs/tmi99.pdf
http://ww.vavlab.ee.boun.edu.tr/courses/574/materialx/Active%20Contours/xu_GVF.pdf
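The double slash noted in the redirect targets above is easy to detect automatically. The sketch below only flags and normalizes the symptom; whether the collapsed URL actually resolves is not known and would still have to be checked. Both function names are assumptions for this example.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def has_double_slash(url):
    """True when the path part of url contains two or more consecutive slashes."""
    return "//" in urlsplit(url).path


def collapse_slashes(url):
    """Return url with runs of slashes in the path reduced to a single slash."""
    parts = urlsplit(url)
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=clean_path))
```

The `//` that follows the scheme is not part of the path, so it is left untouched.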
Some links that are, in my eyes, dangerous and that refer to an old version of ITK:
- File: Modules/Core/Common/include/itkImageBase.h
https://github.com/InsightSoftwareConsortium/ITK/blob/v5.3.0/Modules/Core/Common/include/itkImageToImageFilter.h#L78-L92
https://github.com/InsightSoftwareConsortium/ITK/blob/v5.3.0/Modules/Core/Common/src/itkImageToImageFilterCommon.cxx#L26-L27
Regarding the HTML_EXTRA_STYLESHEET (see https://github.com/InsightSoftwareConsortium/ITK/issues/4240#issuecomment-2614514448), proposed patch: diff.patch
Note: the file Documentation/Doxygen/ITKDoxygenStyle.css should be removed as well; hopefully this is clear from the proposed patch.
PR with this patch opened as #5199.