Regularly check for broken links

Open hammer opened this issue 4 years ago • 8 comments

Inspired by https://github.com/dask/dask-examples/pull/151, which I encountered manually, I just ran http://examples.dask.org through a dead link checker and found a handful of broken links. I don't have time to fix them all right now but I just thought I'd drop the results here.

It might be a good idea to include a dead link check as part of the website deploy, but that may also be overkill!

| Status | URL | Source link text |
|---|---|---|
| -1 Invalid URL | http://127.0.0.1:8787/status | http://127.0.0.1:8787/status |
| 404 Not Found | https://docs.dask.org/en/latest/bag-overview.html | Dask Bag Documentation |
| 404 Not Found | https://www.continuum.io/sites/default/files/dask_stacked.png | <No Text> |
| -1 Invalid URL | http://10.20.0.141:8787/status | http://10.20.0.141:8787/status |
| 404 Not Found | https://ml.dask.org/examples/xgboost.html | http://ml.dask.org/examples/xgboost.html |
| 404 Not Found | https://xgboost.readthedocs.io/en/latest/python/python_intro | https://xgboost.readthedocs.io/en/latest/python/python_intro |
| 404 Not Found | https://distributed.readthedocs.io/en/latest/local-cluster.html | local cluster |
| 404 Not Found | https://docs.scipy.org/doc/numpy-1.16.0/reference/c-api | https://docs.scipy.org/doc/numpy-1.16.0/reference/c-api |
| 404 Not Found | https://scikit-learn.org/stable/modules/scaling_strategies.html | user guide [301 from http://scikit-learn.org/stable/modules/scaling_strategies.html] |
| 404 Not Found | https://numpy.org/doc/stable/reference/c-api.generalized-ufuncs.html | Generalized Universal Functions [302 from https://docs.scipy.org/doc/numpy/reference/c-api.generalized-ufuncs.html] |
| 404 Not Found | https://examples.dask.org/proxy/8787/status | dashboard's status page |
| 404 Not Found | https://examples.dask.org/proxy/8787/graph | dashboard's graph page |
| 404 Not Found | https://examples.dask.org/applications/' | Cleaning up temporary directories and files |
| 404 Not Found | https://examples.dask.org/applications/clip.gif | img/src |
| 404 Not Found | https://examples.dask.org/surveys/examples.dask.org | dask examples |
| -1 Timeout | http://www.celeryproject.org/ | Celery |
| 404 Not Found | https://distributed.readthedocs.io/en/latest/setup.html | scale out to a cluster |
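
For anyone who wants to reproduce a report like the table above without an external service, here is a minimal sketch using only the Python standard library. It is not the checker used for the results above; `LinkCollector`, `extract_links`, and `check_link` are hypothetical names, and returning -1 for requests that cannot complete is an assumption that mirrors the Status column.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect href/src targets from one HTML page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link target found in the given HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def check_link(url, timeout=10):
    """Return the HTTP status code, or -1 if the request could not complete."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (URLError, OSError, ValueError):
        # Invalid URL, DNS failure, refused connection, or timeout.
        return -1
```

Some servers reject `HEAD` requests, so a real checker would probably need to fall back to `GET` before reporting a link as broken.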

hammer avatar Jun 04 '20 18:06 hammer

cc @dask/maintenance

mrocklin avatar Jun 08 '20 15:06 mrocklin

Note that all the ones with 8787 are clearly not meant to exist; they would refer to a running scheduler. Adding a link checker isn't a bad idea, but I wouldn't require it to succeed for a PR to pass.
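
A rough sketch of how a checker could leave out those scheduler links before reporting, assuming the results arrive as `(url, status)` pairs. The names `is_scheduler_link` and `broken_links` are hypothetical, and the rule itself (port 8787, a `/proxy/8787/` path, or a private or loopback IP address) is only a guess based on the entries in the table above.

```python
import ipaddress
from urllib.parse import urlparse

DASHBOARD_PORT = 8787  # port that appears in the dashboard links in the table


def is_scheduler_link(url):
    """Guess whether a URL points at a running scheduler, not a document."""
    parts = urlparse(url)
    try:
        port = parts.port
    except ValueError:
        port = None  # malformed port in the URL
    if port == DASHBOARD_PORT:
        return True
    # Proxied dashboards, e.g. https://examples.dask.org/proxy/8787/status
    if f"/proxy/{DASHBOARD_PORT}/" in parts.path:
        return True
    try:
        address = ipaddress.ip_address(parts.hostname or "")
    except ValueError:
        return False  # an ordinary hostname, not an IP address
    return address.is_private or address.is_loopback


def broken_links(results):
    """Keep failed (url, status) pairs that are not scheduler links."""
    return [
        (url, status)
        for url, status in results
        if (status < 0 or status >= 400) and not is_scheduler_link(url)
    ]
```

To keep the check from blocking a PR, a CI job could print the output of `broken_links` and still exit with status 0.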

martindurant avatar Jun 08 '20 15:06 martindurant

Yeah, I'm more concerned with fixing the links pointing to old or stale documentation. I agree that many of these probably came from docs that referred to addresses generally.

mrocklin avatar Jun 08 '20 15:06 mrocklin

Yeah the built docs have the same issue. I agree fixing is more important than adding a CI check.

jsignell avatar Jun 08 '20 15:06 jsignell

Especially since it will always be external changes that would cause a failure.

jsignell avatar Jun 08 '20 15:06 jsignell

Well, I think that CI would also be grand, if only to make us aware of failures as they arise due to external changes. I'll take what I can get though. Fixing existing links, or adding redirects upstream (see docs/conf.py in most repositories), should be an easy fix for most folks.
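
For illustration, one common way to add such a redirect is to write a small HTML stub at the old path that forwards to the new page. The sketch below shows that general technique only; it is not copied from any repository's `docs/conf.py`, and `REDIRECTS`, `redirect_html`, `write_redirects`, and the page names are all hypothetical.

```python
from pathlib import Path

# Hypothetical mapping from a removed page to its replacement.
REDIRECTS = {
    "old-page.html": "new-page.html",
}

TEMPLATE = """<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="refresh" content="0; url={target}">
    <link rel="canonical" href="{target}">
  </head>
  <body>This page has moved to <a href="{target}">{target}</a>.</body>
</html>
"""


def redirect_html(target):
    """Render a stub page that forwards the browser to the target."""
    return TEMPLATE.format(target=target)


def write_redirects(build_dir, redirects=REDIRECTS):
    """Write one stub per old path into the built HTML directory."""
    written = []
    for old, new in redirects.items():
        path = Path(build_dir) / old
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(redirect_html(new), encoding="utf-8")
        written.append(path)
    return written
```

In a Sphinx project, a function like this could be called from a handler connected to the `build-finished` event in `conf.py`, so the stubs are regenerated on every build.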

mrocklin avatar Jun 08 '20 16:06 mrocklin

I would like to take this up if no one else has already. I can start by manually fixing the URLs listed in the issue.

pratyakshajha avatar Oct 03 '20 12:10 pratyakshajha

That would be welcome, thanks @pratyakshajha!

jrbourbeau avatar Oct 03 '20 15:10 jrbourbeau