
Broken Hyperlinks on Website

Open hectormz opened this issue 3 years ago • 7 comments

I ran https://docs.pymc.io/ through a dead link checker and found a number of broken links on the website. Some should be fixed in the pymc-examples repo and some in the main pymc3 repo, so I'm posting in both. Some hyperlinks (and the images of Bayesian book covers) just need to be updated, but others may point to entities or resources that no longer exist.

This list doesn't seem to be exhaustive: the link from the GLM Logistic Regression notebook by @springcoil and @jbencook to @jbencook's original blog post is definitely broken, yet it is not in the list.

Error URL Anchor Text Linked From
-1 Not found: The server name or address could not be resolved http://biostat.mc.vanderbilt.edu/wiki/Main/ChrisFonnesbeck Chris Fonnesbeck https://docs.pymc.io/notebooks/GLM-hierarchical.html
-1 Not found: A connection with the server could not be established https://sphinx-doc.org/ Sphinx https://docs.pymc.io/notebooks/GLM-hierarchical.html
403 Forbidden http://f.cl.ly/items/0R1W063h1h0W2M2C0S3M/Screen Shot 2013-10-10 at 8.22.21 AM.png img/src https://docs.pymc.io/notebooks/GLM-hierarchical.html
403 Forbidden http://f.cl.ly/items/38020n2t2Y2b1p3t0B0e/Screen Shot 2013-10-10 at 8.23.36 AM.png img/src https://docs.pymc.io/notebooks/GLM-hierarchical.html
403 Forbidden http://f.cl.ly/items/1B3U223i002y3V2W3r0W/Screen Shot 2013-10-10 at 8.25.05 AM.png img/src https://docs.pymc.io/notebooks/GLM-hierarchical.html
404 Not Found https://docs.pymc.io/_static/semantic-sphinx.css link/href https://docs.pymc.io/notebooks/GLM-hierarchical.html
404 Not Found https://docs.pymc.io/_static/default.css link/href https://docs.pymc.io/notebooks/GLM-hierarchical.html
-1 Not found: The host name in the certificate is invalid or does not match https://www.stat.columbia.edu/~gelman/book/ Book website https://docs.pymc.io/learn.html
403 Forbidden https://lh5.googleusercontent.com/Ms2ssellxl7cM6OEL_kpiKRojcj2E4ZaUWDXOa8zEwi-v9orJGYuhjczbwFSDJNsEb_ruiwtCJONNjoo7T1c7qorZm3LsAnroMAm4S5WzNT_PVqWz9aE=w1280 img/src https://docs.pymc.io/learn.html
530 https://quantopian.com/   https://docs.pymc.io/
999 Non-standard https://www.linkedin.com/pub/danne-elbers/69/3a2/7ba Danne Elbers [301 from http://www.linkedin.com/pub/danne-elbers/69/3a2/7ba] https://docs.pymc.io/notebooks/GLM-hierarchical.html
-1 Timeout http://deeplearning.net/software/theano/ Theano https://docs.pymc.io/developer_guide.html
404 Not Found https://cdasr.mclean.harvard.edu/about-us/current-lab-members/14-faculty/62-daniel-dillon Dan Dillon [301 from http://cdasr.mclean.harvard.edu/index.php/about-us/current-lab-members/14-faculty/62-daniel-dillon] [301 from https://cdasr.mclean.harvard.edu/index.php/about-us/current-lab-members/14-faculty/62-daniel-dillon] https://docs.pymc.io/notebooks/GLM-hierarchical.html
404 Not Found https://github.com/pymc-devs/pymc3/blob/master/docs/source/notebooks/weibull_aft.ipynb this example notebook https://docs.pymc.io/api/bounds.html
403 Forbidden http://bayesiandeeplearning.org/papers/BDL_21.pdf http://bayesiandeeplearning.org/papers/BDL_21.pdf https://docs.pymc.io/api/inference.html
404 Not Found https://arviz-devs.github.io/arviz/notebooks/Introduction.html https://arviz-devs.github.io/arviz/notebooks/Introduction.html https://docs.pymc.io/api/data.html
404 Not Found https://numpy.org/doc/stable/neps/npy-format.html https://docs.scipy.org/doc/numpy/neps/npy-format.html [301 from https://docs.scipy.org/doc/numpy/neps/npy-format.html] https://docs.pymc.io/api/backends.html
404 Not Found https://docs.pymc.io/pymc-examples/examples/api/distributions.rst pymc3.distributions https://docs.pymc.io/pymc-examples/examples/pymc3_howto/api_quickstart.html
404 Not Found https://docs.pymc.io/pymc-examples/examples/api/distributions/continuous.rst continuous https://docs.pymc.io/pymc-examples/examples/pymc3_howto/api_quickstart.html
404 Not Found https://docs.pymc.io/pymc-examples/examples/api/distributions/discrete.rst discrete https://docs.pymc.io/pymc-examples/examples/pymc3_howto/api_quickstart.html
404 Not Found https://docs.pymc.io/pymc-examples/examples/api/distributions/timeseries.rst timeseries https://docs.pymc.io/pymc-examples/examples/pymc3_howto/api_quickstart.html
404 Not Found https://docs.pymc.io/pymc-examples/examples/api/distributions/mixture.rst mixture https://docs.pymc.io/pymc-examples/examples/pymc3_howto/api_quickstart.html
404 Not Found https://docs.pymc.io/pymc-examples/examples/pymc3_howto/sampler-stats.ipynb here https://docs.pymc.io/pymc-examples/examples/pymc3_howto/api_quickstart.html
404 Not Found https://docs.pymc.io/pymc-examples/examples/pymc3_howto/Diagnosing_biased_Inference_with_Divergences.ipynb here https://docs.pymc.io/pymc-examples/examples/pymc3_howto/api_quickstart.html
-1 Timeout http://deeplearning.net/software/theano/library/tensor/index.html the theano api docs https://docs.pymc.io/PyMC3_and_Theano.html
404 Not Found https://docs.pymc.io/prob_dists.html custom distributions https://docs.pymc.io/notebooks/blackbox_external_likelihood.html
404 Not Found https://docs.pymc.io/advanced_theano.html custom Theano Op https://docs.pymc.io/notebooks/blackbox_external_likelihood.html
-1 Timeout http://deeplearning.net/software/theano/extending/extending_theano.html Theano Op https://docs.pymc.io/notebooks/blackbox_external_likelihood.html
-1 Timeout http://deeplearning.net/software/theano/extending/op.html grad() method https://docs.pymc.io/notebooks/blackbox_external_likelihood.html
-1 Timeout http://deeplearning.net/software/theano/library/compile/function.html Theano function https://docs.pymc.io/notebooks/blackbox_external_likelihood.html
-1 Timeout http://deeplearning.net/software/theano/library/gradient.html Theano tensor gradient https://docs.pymc.io/notebooks/blackbox_external_likelihood.html
404 Not Found http://docs.pymc.io/examples.html these examples [301 from http://pymc-devs.github.io/pymc3/examples.html] https://docs.pymc.io/pymc-examples/examples/pymc3_howto/api_quickstart.html
-1 Not found: The server name or address could not be resolved http://5047-presscdn.pagely.netdna-cdn.com/wp-content/uploads/2015/04/iris_petal_sepal.png img/src https://docs.pymc.io/pymc-examples/examples/variational_inference/variational_api_quickstart.html
-1 Timeout http://deeplearning.net/software/theano/library/tensor/basic.html an overview of the available types https://docs.pymc.io/notebooks/getting_started.html
404 Not Found https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/disaster_model_theano_op.py a more elaborate example of the usage of as_op https://docs.pymc.io/notebooks/getting_started.html
-1 Timeout http://deeplearning.net/software/theano/extending/index.html documentation, https://docs.pymc.io/Advanced_usage_of_Theano_in_PyMC3.html

hectormz avatar May 27 '21 02:05 hectormz

Thanks! I have added a note on this at https://github.com/pymc-devs/pymc-examples/wiki/Notebook-updates-overview. You mention a dead link checker. Do you know if it could be added as a step to our CI?

CI currently checks format, execution order and a couple of other things, and it would be great to check links too, especially given that reviewnb sometimes doesn't render links correctly: a link may not work for me when reviewing even though it works locally and once rendered on the website.

OriolAbril avatar May 27 '21 16:05 OriolAbril

@OriolAbril I had just used https://www.deadlinkchecker.com, which seems a little stochastic and not practically useful here. Would you want to use something in pre-commit and/or a GitHub Action?

hectormz avatar May 28 '21 01:05 hectormz

I found pytest-check-links which works on:

  • .html
  • .rst
  • .md
  • .ipynb (requires nbconvert)

One consideration is whether you want to check the source files for broken links or check the resulting files/website itself.
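To make the "check the source files" option concrete, here is a minimal sketch using only the standard library. It is not how pytest-check-links works internally; the function names (`extract_links`, `notebook_markdown`, `find_broken`, `http_status`) are made up for illustration, and the URL regex is deliberately loose, so URLs containing spaces or parentheses would be cut short.

```python
import json
import re
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List

# Loose pattern for absolute http(s) URLs: stops at whitespace and at
# common closing delimiters used by Markdown/rst link syntax.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_links(text: str) -> List[str]:
    """Return the unique URLs found in text, in order of first appearance."""
    seen: Dict[str, None] = {}
    for match in URL_RE.finditer(text):
        # Drop sentence punctuation that happens to follow a bare URL.
        seen.setdefault(match.group(0).rstrip(".,;:"), None)
    return list(seen)


def notebook_markdown(raw: str) -> str:
    """Join the markdown cells of a notebook passed as a JSON string."""
    parts = []
    for cell in json.loads(raw).get("cells", []):
        if cell.get("cell_type") == "markdown":
            source = cell.get("source", "")
            parts.append(source if isinstance(source, str) else "".join(source))
    return "\n".join(parts)


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a HEAD request, or -1 if no response arrived."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 or 404
    except (OSError, ValueError):
        return -1  # DNS failure, timeout, refused connection, malformed URL


def find_broken(
    urls: Iterable[str], fetch_status: Callable[[str], int] = http_status
) -> Dict[str, int]:
    """Map every URL whose status is outside 200-399 to that status."""
    broken = {}
    for url in urls:
        status = fetch_status(url)
        if not 200 <= status < 400:
            broken[url] = status
    return broken
```

For a notebook, something like `find_broken(extract_links(notebook_markdown(open(path).read())))` would report the failing links in its markdown cells. Passing `fetch_status` as a parameter keeps the network out of unit tests. Some servers reject HEAD requests, so a real checker would probably need a GET fallback.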

hectormz avatar May 28 '21 17:05 hectormz

I think checking the source files is fine, maybe even better. There will be links that are Sphinx-generated and therefore won't be checked with this approach, but these rely on intersphinx, which should be quite robust and error out when building the docs if there is an issue.

OriolAbril avatar Jun 01 '21 18:06 OriolAbril

Maybe pytest-check-links is worth checking out then. Depending on how long it takes, it could be run by CI on PRs, etc.

hectormz avatar Jun 09 '21 19:06 hectormz

I don't know if this deserves a (separate) issue, since it's very minor. Still, I want to at least mention it.
In https://www.pymc.io/projects/examples/en/latest/gallery.html the card for "GLM: Linear regression" does not include a hyperlink. Since the link is set as "pymc:GLM_linear" in gallery.rst, it could just be related to this capitalization issue in sphinx: https://github.com/sphinx-doc/sphinx/issues/8982

chkunkel avatar Nov 09 '22 05:11 chkunkel

Thanks, it looks like we should update the target and the reference so both are lowercase only, as the Sphinx issue is a bit old already and doesn't look active. Do you want to send the PRs for this?

The target is defined in https://github.com/pymc-devs/pymc/blob/main/docs/source/learn/core_notebooks/GLM_linear.ipynb (top of the file)
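If it helps to find other targets with the same problem, here is a rough sketch that lists explicit targets containing uppercase characters. It assumes targets are written either as rst labels (`.. _name:`) or as MyST labels (`(name)=`) on their own line; the function name `mixed_case_targets` is made up for illustration, and whether lowercasing actually fixes the gallery card is still the assumption discussed above.

```python
import re
from typing import List

# rst label on its own line, e.g. ".. _GLM_linear:"
RST_TARGET_RE = re.compile(r"^\.\. _([^:\n]+):[ \t]*$", re.MULTILINE)
# MyST label on its own line, e.g. "(GLM_linear)="
MYST_TARGET_RE = re.compile(r"^\(([^()\n]+)\)=[ \t]*$", re.MULTILINE)


def mixed_case_targets(text: str) -> List[str]:
    """Return the target names in text that are not entirely lowercase."""
    names = RST_TARGET_RE.findall(text) + MYST_TARGET_RE.findall(text)
    return [name for name in names if name != name.lower()]
```

For a notebook, the text to scan would be the content of its markdown cells rather than the raw JSON.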

OriolAbril avatar Nov 09 '22 17:11 OriolAbril