Add broken link checker

Open jhlegarreta opened this issue 2 years ago • 6 comments

Description

The ITK documentation (both Markdown files and class documentation) frequently contains links to external websites. Some of these links may be broken. A tool (a pre-commit hook and/or a GHA workflow) that checks for broken links would prevent new broken links from being introduced and would point out which existing links are broken across the code base.

Expected information

Broken links should not exist.

Actual information

Broken links may exist in the ITK documentation.

Versions

master

Additional Information

This probably applies to the ITK SWG, and maybe the ITK Sphinx Examples as well.

jhlegarreta avatar Oct 01 '23 17:10 jhlegarreta

I ran the W3C link checker locally on the output of the documentation build. Quite a few warnings come from:

  • permanent redirects, especially:

    List of redirects
    https://itk.org/ITKExamples
    -> https://examples.itk.org/
     Line: 140
     Code: 301 -> 200 OK
    To do: This is a permanent redirect. The link should be updated.
    
    
    https://itk.org/ITKExamples/src/Core/Common/ProduceImageProgrammatically/Documentation.html
    -> https://examples.itk.org/src/core/common/produceimageprogrammatically/documentation
     Line: 143
     Code: 301 -> 200 OK
    To do: This is a permanent redirect. The link should be updated.
    

    There are two possible solutions:

    • fix the links at the source; only a few files are affected:
       ../../../Documentation/docs/contributing/upload_binary_data.md
       ../../../Documentation/docs/releases/4.5.md
       ../../../Documentation/docs/releases/5.0.md
       ../../../Documentation/docs/releases/5.0a02.md
       ../../../Documentation/docs/releases/5.0b03.md
       ../../../Documentation/docs/releases/5.1.md
       ../../../Documentation/docs/releases/5.2.md
       ../../../Examples/README.md
       ../../../GettingStarted.md
       ../../../Modules/Remote/SphinxExamples.remote.cmake
       ../../../Utilities/Doxygen/DoxygenConfig.cmake
       ../../../Utilities/KWStyle/KWStyle.cmake
       ../../../Wrapping/DoxygenConfig.cmake
       
    • add an exception for these redirects to my link-test script.

    Please advise.

  • A message about a missing file like:

     List of broken links and other issues:
     file:///.../build/Utilities/Doxygen/html/doxygen.png
       Line: 132
       Code: 404 File `.../build/Utilities/Doxygen/html/doxygen.png' does not exist
      To do: The link is broken. Double-check that you have not made any typo,
             or mistake in copy-pasting. If the link points to a resource that
             no longer exists, you may want to remove or fix the link.
    

    doxygen.png is used in Documentation/Doxygen/DoxygenFooter.html and should be updated when moving to a newer doxygen version (the same applies to DoxygenHeader.html). The doxygen.png file has been replaced by doxygen.svg since doxygen version 1.8.19.

  • A few messages from the css file like:

    List of broken links and other issues:
    file:///.../build/Utilities/Doxygen/html/doc.png
      Line: (N/A)
      Code: 404 File `.../build/Utilities/Doxygen/html/doc.png' does not exist
     To do: The link is broken. Double-check that you have not made any typo,
            or mistake in copy-pasting. If the link points to a resource that
            no longer exists, you may want to remove or fix the link.
    

    This is because the CMake settings file contains the setting:

    set(DOXYGEN_HTML_EXTRA_STYLESHEET "${ITK_SOURCE_DIR}/Documentation/Doxygen/ITKDoxygenStyle.css")
    

    which results in the following entry in the doxygen settings file:

    HTML_EXTRA_STYLESHEET  = .../Documentation/Doxygen/ITKDoxygenStyle.css
    

    Comparing this file with the original file from doxygen 1.8.15 shows that they are identical. HTML_EXTRA_STYLESHEET is meant for additional settings; as used now, it overrides most of the default settings, so quite a few of the newer features are missing. The png files have been replaced since doxygen version 1.9.7.
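For the first option above (fixing the permanent redirects at the source), the update could be scripted. The following is a minimal sketch, not something used in this thread: the function names `rewrite_links` and `rewrite_file` are hypothetical, and the mapping would have to be filled in from the link checker report. It replaces only exact URL matches, because deep links under https://itk.org/ITKExamples do not all redirect by a simple prefix change.

```python
import re
from pathlib import Path

# Matches http(s) URLs up to whitespace, brackets, or quotes.
URL_RE = re.compile(r"https?://[^\s<>()\[\]\"']+")


def rewrite_links(text, redirects):
    """Replace each URL that is an exact key of `redirects` by its target."""

    def replace(match):
        url = match.group(0)
        # Keep trailing sentence punctuation out of the lookup.
        core = url.rstrip(".,;:")
        tail = url[len(core):]
        return redirects.get(core, core) + tail

    return URL_RE.sub(replace, text)


def rewrite_file(path, redirects):
    """Rewrite one file in place; return True if it changed."""
    path = Path(path)
    old = path.read_text(encoding="utf-8")
    new = rewrite_links(old, redirects)
    if new != old:
        path.write_text(new, encoding="utf-8")
    return new != old


# Example mapping built from redirects reported by the link checker.
redirects = {
    "https://itk.org/ITKExamples": "https://examples.itk.org/",
    "http://www.openjpeg.org/": "https://www.openjpeg.org/",
}
sample = "See https://itk.org/ITKExamples and http://www.openjpeg.org/."
print(rewrite_links(sample, redirects))
```

URLs that are not in the mapping are left untouched, so deep links can be reviewed one by one.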

albert-github avatar Jan 26 '25 17:01 albert-github

:100: Thanks for doing this. Can a GHA workflow file be added to check the links using the checker that you have used? Or, can pre-commit do this job by adding some relevant task?

jhlegarreta avatar Jan 26 '25 21:01 jhlegarreta

I run the W3C link checker version 4.81 (see e.g. http://validator.w3.org/docs/checklink.html) on my local PC, and it needs the built documentation. There may be better link checkers nowadays, though. Unfortunately, it also reports some duplicate / missing doxygen links, which we (the doxygen developers) will have to look into in more detail.

  • Can a GHA workflow file be added to check the links using the checker that you have used?

    This is probably possible, but I have not tried it and have no experience with it.

  • Or, can pre-commit do this job by adding some relevant task?

    As written above, the tool I use requires the built documentation, and the link checking as it stands is also quite slow, so I don't think it would be suitable for a pre-commit task.

As far as I can see now, it might be suitable for a GitHub Action (but run e.g. only once a day / week).

  • I will check the current set of remaining problems and report the ITK-related link problems here. (Unfortunately, the link checker gives the name of the HTML file and not the original source file. Also, one link problem might occur multiple times in the sources or in the HTML files; I will normally report it only once.)
  • I also think (but this is far outside my jurisdiction) that ITK should move to a newer version of doxygen (ideally the current release), at least for ITK 6.0.0 and newer versions.
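To illustrate what a scheduled check over the built HTML might do, here is a minimal sketch using only the Python standard library. It is not the W3C link checker; the names `extract_links`, `classify`, and `check` are hypothetical, and the categories simply mirror the ones in the reports in this thread (permanent redirect, broken).

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URLs from href and src attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and value.startswith(
                ("http://", "https://")
            ):
                self.links.append(value)


def extract_links(html):
    """Return the unique external links of one HTML page, sorted."""
    collector = LinkCollector()
    collector.feed(html)
    return sorted(set(collector.links))


def classify(first_status, final_status):
    """Map the first and final HTTP status codes to a report category."""
    if final_status >= 400:
        return "broken"
    if first_status in (301, 308):
        return "permanent-redirect"
    return "ok"


class RedirectRecorder(urllib.request.HTTPRedirectHandler):
    """Follow redirects as usual, but remember their status codes."""

    def __init__(self):
        self.codes = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.codes.append(code)
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def check(url, timeout=10.0):
    """Fetch one URL over the network and classify the result."""
    recorder = RedirectRecorder()
    opener = urllib.request.build_opener(recorder)
    try:
        with opener.open(url, timeout=timeout) as response:
            final = response.status
    except urllib.error.HTTPError as err:
        final = err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return "broken"
    first = recorder.codes[0] if recorder.codes else final
    return classify(first, final)
```

A scheduled job could call `extract_links` on every generated HTML file and `check` on each unique URL. Because every URL costs a network round trip, this matches the observation above that such a check is too slow for pre-commit.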

albert-github avatar Jan 27 '25 10:01 albert-github

Continuation of https://github.com/InsightSoftwareConsortium/ITK/issues/4240#issuecomment-2614514448

  • permanent redirects

    List of redirects
    https://www.itk.org/
    -> https://itk.org/
      Line: 160
      Code: 301 -> 200 OK
     To do: This is a permanent redirect. The link should be updated.
    
    http://commonfund.nih.gov/bioinformatics
    -> https://commonfund.nih.gov/bioinformatics
      Line: 136
      Code: 301 -> 200 OK
     To do: This is a permanent redirect. The link should be updated.
    
    https://www.creatis.insa-lyon.fr/
    -> https://www.creatis.insa-lyon.fr/site/fr
      Line: 136
      Code: 301 -> 200 OK
     To do: This is a permanent redirect. The link should be updated.
    
    http://www.fp.ucalgary.ca/mhallbey/tutorial.htm
    -> https://prism.ucalgary.ca/handle/1880/51900
    
    http://niftilib.sourceforge.net/
    -> https://niftilib.sourceforge.net/
      Line: 161
      Code: 301 -> 200 OK
     To do: This is a permanent redirect. The link should be updated.
    

    also

    http://users.polytech.unice.fr/~lingrand/MarchingCubes/algo.html
    -> https://users.polytech.unice.fr/~lingrand/MarchingCubes/algo.html
    
    https://www.itk.org/pipermail/insight-users/2008-May/026112.html
    -> https://itk.org/pipermail/insight-users/2008-May/026112.html
    

    and more references to pipermail

    http://www.poynton.com/ColorFAQ.html
    -> https://poynton.ca/ColorFAQ.html
    
    https://midasjournal.org/browse/publication/825
    -> https://midasjournal.org/browse/publication/825/
    
    https://rules.sonarsource.com/cpp/RSPEC-6032/
    -> https://rules.sonarsource.com/cpp/rspec-6032/
    
    https://www.na-mic.org/Wiki/index.php/NAMIC_Wiki:DTI:ITK-DiffusionTensorPixelType
    -> https://www.na-mic.org/wiki/NAMIC_Wiki:DTI:ITK-DiffusionTensorPixelType
    
    http://www.openjpeg.org/
    -> https://www.openjpeg.org/
    
    http://www.libtiff.org/
    -> https://libtiff.gitlab.io/libtiff/
    
    https://code.google.com/p/double-conversion/
    -> https://github.com/google/double-conversion
    
    http://www.nitrc.org/projects/gifti/
    -> https://www.nitrc.org/projects/gifti/
    
    http://www.netlib.org/slatec/
    -> https://www.netlib.org/slatec/
    
    https://www.birncommunity.org/
    -> https://writepaperfor.me/birncommunity-org
    
    
    http://vxl.sourceforge.net/
    -> https://vxl.sourceforge.net/
    

    and more sourceforge related http -> https links

  • broken link:

    http://www.cssip.uq.edu.au/meastex/www/algs/algs/algs.html
      Line: 153
      Code: 500 Can't connect to www.cssip.uq.edu.au:80 (Name or service not known)
     To do: This is a server side problem. Check the URI.
    
    https://analyzedirect.com/support/10.0Documents/Analyze_Resource_01.pdf
      Line: 135
      Code: 404 Not Found
     To do: The link is broken. Double-check that you have not made any typo,
            or mistake in copy-pasting. If the link points to a resource that
            no longer exists, you may want to remove or fix the link.
    
    https://doi.org/10.54294/olkmog9
      Line: 132
      Code: 404 Not Found
     To do: The link is broken. Double-check that you have not made any typo,
            or mistake in copy-pasting. If the link points to a resource that
            no longer exists, you may want to remove or fix the link.
    
    https://itk.org/ITKExamples/src/Core/Common/CustomOperationToEachPixelInImage/Documentation.html
      Line: 143
      Code: 500 Server closed connection without sending any data back
     To do: This is a server side problem. Check the URI.
    

    and

    https://itk.org/ITKExamples/src/Filtering/Smoothing/ApplyMedianFilter/Documentation.html
    

    also

    https://www.cs.tut.fi/~ant/histthresh
    
    https://citeseer.ist.psu.edu/sezgin04survey.html
    
  • broken links from redirects (NOTE: the double slash (//) inside the redirect target):

    https://itk.org/ITKExamples/src/Core/Common/RandomSelectPixelFromRegionWithoutReplace/Documentation.html
    -> https://examples.itk.org//src/Core/Common/RandomSelectPixelFromRegionWithoutReplace/Documentation.html
      Line: 217
      Code: 301 -> 404 Not Found
     To do: The link is broken. Double-check that you have not made any typo,
            or mistake in copy-pasting. If the link points to a resource that
            no longer exists, you may want to remove or fix the link.
    

    also

    https://itk.org/ITKExamples/src/Core/Common/CreateAnother/Documentation.html
    -> https://examples.itk.org//src/Core/Common/CreateAnother/Documentation.html
    
    https://itk.org/ITKExamples/src/Core/Mesh/ConvertMeshToUnstructuredGrid/Documentation.html
    -> https://examples.itk.org//src/Core/Mesh/ConvertMeshToUnstructuredGrid/Documentation.html
    
    https://itk.org/ITKExamples/src/Filtering/ImageIntensity/RescaleAnImage/Documentation.html
    -> https://examples.itk.org//src/Filtering/ImageIntensity/RescaleAnImage/Documentation.html
    
    https://itk.org/ITKExamples/src/IO/ImageBase/Creade3DFromSeriesOf2D/Documentation.html
    -> https://examples.itk.org//src/IO/ImageBase/Creade3DFromSeriesOf2D/Documentation.html
    
    https://itk.org/ITKExamples/src/Core/Common/Transparency/Documentation.html
    -> https://examples.itk.org//src/Core/Common/Transparency/Documentation.html
    
    https://itk.org/ITKExamples/src/Core/Common/OutOfBoundsPixelsReturnConstValue/Documentation.html
    -> https://examples.itk.org//src/Core/Common/OutOfBoundsPixelsReturnConstValue/Documentation.html
    
    https://itk.org/ITKExamples/src/Core/SpatialObjects/LineSpatialObject/Documentation.html
    -> https://examples.itk.org//src/Core/SpatialObjects/LineSpatialObject/Documentation.html
    
    https://itk.org/ITKExamples/src/Filtering/Smoothing/SmoothWithRecursiveGaussian/Documentation.html
    -> https://examples.itk.org//src/Filtering/Smoothing/SmoothWithRecursiveGaussian/Documentation.html
    
    https://itk.org/ITKExamples/src/Filtering/ImageFeature/FindZeroCrossings/Documentation.html
    -> https://examples.itk.org//src/Filtering/ImageFeature/FindZeroCrossings/Documentation.html
    
    https://itk.org/ITKExamples/src/Filtering/ImageLabel/ExtractBoundariesOfBlobsInBinaryImage/Documentation.html
    -> https://examples.itk.org//src/Filtering/ImageLabel/ExtractBoundariesOfBlobsInBinaryImage/Documentation.html
    
    https://itk.org/ITKExamples/src/Filtering/ImageFeature/RequestedRegion/Documentation.html
    -> https://examples.itk.org//src/Filtering/ImageFeature/RequestedRegion/Documentation.html
    
    https://itk.org/ITKExamples/src/Filtering/ImageIntensity/SetOutputPixelToMax/Documentation.html
    -> https://examples.itk.org//src/Filtering/ImageIntensity/SetOutputPixelToMax/Documentation.html
    
    https://itk.org/ITKExamples/src/Filtering/ImageIntensity/SetOutputPixelToMin/Documentation.html
    -> https://examples.itk.org//src/Filtering/ImageIntensity/SetOutputPixelToMin/Documentation.html
    
    https://itk.org/ITKExamples/src/Filtering/Smoothing/ApplyMeanFilter/Documentation.html
    -> https://examples.itk.org//src/Filtering/Smoothing/ApplyMeanFilter/Documentation.html
    
  • broken links with redirects

    https://www.itk.org/mailman/private/insight-developers/2009-February/011732.html
    -> https://itk.org/mailman/private/insight-developers/2009-February/011732.html
    
    https://www.itk.org/Wiki/Proposals:Orientation
    -> https://itk.org/Wiki/Proposals:Orientation
      Line: 1910
      Code: 301 -> 500 Internal Server Error
     To do: This is a server side problem. Check the URI.
    Fragments:
            Some_notes_on_the_DICOM_convention_and_current_ITK_usage        Line: 1910
    
    ftp://ftp.inria.fr/INRIA/tech-reports/RR/RR-1893.ps.gz
    

    see: https://inria.hal.science/inria-00074778/en/

    https://www.cs.unc.edu/~styner/docs/tmi00.pdf
    
    https://www.cs.unc.edu/~styner/docs/tmi99.pdf
    
    http://ww.vavlab.ee.boun.edu.tr/courses/574/materialx/Active%20Contours/xu_GVF.pdf
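The double slash noted in the redirect targets above can be detected mechanically. The following is a small sketch with hypothetical helper names (`has_empty_path_segment`, `collapse_slashes`); it only normalizes the URL path, and nothing in this thread establishes that the collapsed URL actually resolves on examples.itk.org.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def has_empty_path_segment(url):
    """True if the URL path contains a double slash, e.g. '//src/...'."""
    return "//" in urlsplit(url).path


def collapse_slashes(url):
    """Collapse repeated slashes in the path; scheme and host stay as-is."""
    parts = urlsplit(url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=path))


target = "https://examples.itk.org//src/Core/Common/CreateAnother/Documentation.html"
if has_empty_path_segment(target):
    print(collapse_slashes(target))
```

Flagging such targets separately makes it easier to tell a misconfigured redirect rule apart from a page that has genuinely moved or been removed.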
    

Some links that are, in my view, dangerous because they refer to an old version of ITK:

  • File: Modules/Core/Common/include/itkImageBase.h
    • https://github.com/InsightSoftwareConsortium/ITK/blob/v5.3.0/Modules/Core/Common/include/itkImageToImageFilter.h#L78-L92
    • https://github.com/InsightSoftwareConsortium/ITK/blob/v5.3.0/Modules/Core/Common/src/itkImageToImageFilterCommon.cxx#L26-L27

albert-github avatar Jan 27 '25 13:01 albert-github

Regarding HTML_EXTRA_STYLESHEET (see https://github.com/InsightSoftwareConsortium/ITK/issues/4240#issuecomment-2614514448), here is a proposed patch: diff.patch. Note: the file Documentation/Doxygen/ITKDoxygenStyle.css should be removed as well; hopefully this is also clear from the proposed patch.

albert-github avatar Jan 29 '25 10:01 albert-github

PR with this patch opened as #5199.

dzenanz avatar Jan 29 '25 18:01 dzenanz