Dash Docs dynos crash if 404 errors occur while loading datasets

Open rpkyle opened this issue 5 years ago • 3 comments

Around 3:15pm this afternoon, attempts to reach https://dash.plot.ly resulted in a 503 error.

The cause appears to be related to the https://dashr.plot.ly outage the preceding evening: several example datasets for the Dash Bio documentation were relocated from plotly/dash-bio to plotly/dash-bio-docs-files. The URLs beginning with

https://raw.githubusercontent.com/plotly/dash-bio/master/tests/dashbio_demos/sample_data/

needed to be updated to

https://raw.githubusercontent.com/plotly/dash-bio-docs-files/master/

This may be a side effect of the changes implemented in https://github.com/plotly/dash-bio/pull/459, which refactored Python demos into standalone applications.

Some of these URLs were split across multiple lines in the source, which made a simple grep for them tricky. Once all related URLs were updated, the dynos were able to start successfully.
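
As a rough illustration of why grep struggles here, and one way around it: assuming the split URLs were written as adjacent Python string literals (an assumption, the issue does not say how they were split), parsing the source with `ast` joins them back into a single string constant. The helper name `find_old_urls` and the file name `example.csv` are hypothetical.

```python
import ast

# Old location named in this issue.
OLD_PREFIX = (
    "https://raw.githubusercontent.com/plotly/dash-bio/"
    "master/tests/dashbio_demos/sample_data/"
)


def find_old_urls(source, prefix=OLD_PREFIX):
    """Return (line_number, string) pairs for string literals containing prefix.

    ast.parse merges adjacent string literals (implicit concatenation) into
    one constant, so a URL split across lines that way is still matched.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if prefix in node.value:
                hits.append((node.lineno, node.value))
    return hits


# Hypothetical snippet with a URL split over two lines.
example = '''
data = load(
    "https://raw.githubusercontent.com/plotly/dash-bio/"
    "master/tests/dashbio_demos/sample_data/example.csv"
)
'''
for line_number, url in find_old_urls(example):
    print(line_number, url)
```

This only covers implicit concatenation; URLs assembled with `+` or string formatting would need a different approach.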

To avoid a repeat of this event in the future, @alexcjohnson suggested that we will likely want to consider

  • 🚁 adding local versions of these datasets
  • 🔎 extending the find_and_replace dict https://github.com/plotly/dash-docs/blob/67df7a9978807d5aac8c1ca2c6d6eadbb618d283/dash_docs/tools.py#L100
  • 🚨 a test of running the docs with external requests somehow disabled
  • 🚨 a test of everything in find_and_replace with external requests, to ensure that there really is a comparable (exact match?) dataset at the original location, so users copying and pasting the example will succeed
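
A minimal sketch of how the two proposed tests might look, using only the standard library. Everything here is hypothetical: `no_external_requests`, `unreachable_urls`, and `http_status` are illustrative names, and the sketch assumes the keys of `find_and_replace` are the original remote URLs. It does not check that the remote and local datasets match exactly.

```python
import socket
import urllib.error
import urllib.request
from contextlib import contextmanager
from unittest import mock


class ExternalRequestBlocked(RuntimeError):
    """Raised when code tries to open a network connection while blocked."""


@contextmanager
def no_external_requests():
    """Make any attempt to open a socket connection fail loudly."""
    def refuse(*args, **kwargs):
        raise ExternalRequestBlocked("external request attempted: %r" % (args,))

    # Patch both the low-level connect method and the helper used by http.client.
    with mock.patch.object(socket.socket, "connect", refuse), \
         mock.patch.object(socket, "create_connection", refuse):
        yield


def unreachable_urls(urls, fetch_status):
    """Return the URLs for which fetch_status(url) is not 200."""
    return [url for url in urls if fetch_status(url) != 200]


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except urllib.error.URLError:
        return None
```

The first test would start the docs inside `with no_external_requests():` and fail on `ExternalRequestBlocked`; the second would assert that `unreachable_urls(find_and_replace.keys(), http_status)` is empty. Passing `fetch_status` as a parameter keeps the checker itself testable without network access.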

rpkyle avatar Jan 19 '20 05:01 rpkyle

Some extra context: these files are loaded at startup time, so moving the files did not initially cause any problems, but some time later it looks like dynos automatically rebooted and didn’t come back up due to this issue.

I might add a couple of other motivating measures:

  1. Make the docs as close to fully self-contained as possible: ideally, don’t load any external dependencies at all.
  2. Make missing external dependencies survivable, e.g. by catching and swallowing the error and providing a degraded experience for the areas of the docs that rely on the missing dependencies.
  3. Always reboot dynos on Friday afternoon to ensure the chances of an uninterrupted weekend? ;)
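
Point 2 could be sketched roughly as follows. This is not how the docs currently load data; `load_text_or_none` and `render_example` are hypothetical names, and the placeholder message is made up for illustration.

```python
import logging
import urllib.request

logger = logging.getLogger(__name__)


def load_text_or_none(url, opener=urllib.request.urlopen, timeout=10):
    """Return the decoded body at url, or None if it cannot be fetched.

    urllib's HTTPError and URLError are both subclasses of OSError, so a
    404 or a network failure ends up in the except branch instead of
    crashing the process at startup.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except (OSError, ValueError) as err:
        logger.warning("Could not load %s: %s", url, err)
        return None


def render_example(url, opener=urllib.request.urlopen):
    """Return a degraded placeholder when the dataset is missing."""
    data = load_text_or_none(url, opener=opener)
    if data is None:
        return "This example is temporarily unavailable."
    return "Loaded %d characters of example data." % len(data)
```

The warning is logged rather than silently dropped so that a missing dataset still shows up in the dyno logs.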

nicolaskruchten avatar Jan 19 '20 13:01 nicolaskruchten

Most examples already run offline at boot. We do a dynamic find-and-replace to change the remote URL to the local URL so that we don't load any external resources. This was originally done so that the docs could run in an offline, airgapped environment.

This logic is here: https://github.com/plotly/dash-docs/blob/67df7a9978807d5aac8c1ca2c6d6eadbb618d283/dash_docs/tools.py#L100-L130
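
For readers who don't want to follow the link, the idea can be sketched like this. It is a simplified illustration, not the code in `tools.py`: the function name, the mapping contents, and the URLs are all hypothetical.

```python
def use_local_datasets(example_source, replacements):
    """Replace each remote URL in example_source with its local path."""
    for remote_url, local_path in replacements.items():
        example_source = example_source.replace(remote_url, local_path)
    return example_source


# Hypothetical mapping from remote dataset URL to a bundled local copy.
replacements = {
    "https://example.com/datasets/iris.csv": "datasets/iris.csv",
}

source = 'df = pd.read_csv("https://example.com/datasets/iris.csv")'
print(use_local_datasets(source, replacements))
# prints: df = pd.read_csv("datasets/iris.csv")
```

A plain string replacement like this only works when the URL appears unbroken in the example source, and a URL that is not in the mapping is left untouched and is still fetched remotely.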

As part of our PR template, we have a checklist item for adding a local version of a dataset if the dataset was added: https://github.com/plotly/dash-docs/blob/master/.github/pull_request_template.md


The exception to this is the Dash Bio examples. There is an issue about this here: https://github.com/plotly/dash-docs/issues/727

The Dash Bio examples weren't converted to this offline program since they aren't all CSVs, and the syntax they use to load datasets remotely might not be the same as the syntax needed to load them locally.

In airgapped environments, we just ignore the dash bio examples for now.

chriddyp avatar Jan 20 '20 15:01 chriddyp

Thanks for pointing to #727 @chriddyp

> In airgapped environments, we just ignore the dash bio examples for now.

We presumably could do that, but this clearly doesn't happen automatically, or we wouldn't have had this weekend's failure. Anyway, if that's easier than making the Dash Bio examples local, we can start that way. Either way, we need the tests mentioned above: one that the docs boot offline (if nothing else, so we're immune to outages elsewhere), and one that any replacements point to valid external resources.

alexcjohnson avatar Jan 20 '20 17:01 alexcjohnson