cjworkbench
HTTP 503 when downloading CSVs
When a workflow has been changed but not yet rendered (so its Steps' cached render results don't exist or are stale), requests to GET /public/moduledata/live/:id.(csv|json) will return HTTP 503.
Steps to reproduce:
- Create a workflow with a "Load HTML from URL" module
- Point it to https://www.nytimes.com and set auto-refresh every 5min
- Look up the "API endpoint" (/public/moduledata/live/:id.csv), and then close the browser window
- Six minutes later, request data from the endpoint.
Expected results: you get new data
Actual results: HTTP 503 -- but if you retry a few seconds later, you'll get data.
The problem: Workbench renders workflows in background processes, while a GET request is handled in the foreground. If the workflow isn't rendered when the request arrives, the request handler can't know when it will be.
This plays badly with auto-refreshes: when auto-refreshing a step, if the workflow has no steps with notifications enabled and nobody has a web client open to the workflow, Workbench skips rendering altogether. (It will only render on-demand.)
The Workbench-side workaround: when we return HTTP 503, we schedule another render of the workflow, in case it hasn't been scheduled yet.
There are two user-side workarounds:
- Enable notifications on any step in the workflow. That will force a render every time data changes -- greatly shrinking the window during which a request would get an HTTP 503 response.
- Configure the client to retry after 10-30s upon HTTP 503.
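The client-side retry workaround could look something like the following. This is only a minimal sketch using Python's standard library: the function name `fetch_with_retry`, its parameters, and the default 15-second delay (picked from the 10-30s range above) are illustrative assumptions, not part of Workbench.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, max_attempts=5, delay_seconds=15.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`, retrying only when the server answers HTTP 503.

    `opener` and `sleep` are injectable so the function can be tested
    without a network connection or real waiting (hypothetical design,
    not a Workbench API).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Any error other than 503 is not the "not yet rendered" case,
            # and the last attempt has nothing left to wait for.
            if err.code != 503 or attempt == max_attempts:
                raise
            # A 503 should also have prompted the server to schedule a
            # render, so wait before asking again.
            sleep(delay_seconds)
```

A caller would pass the workflow's own API endpoint, for example `fetch_with_retry("https://<your-workbench-host>/public/moduledata/live/<id>.csv")`, where the host and id are placeholders. If every attempt returns HTTP 503, the final `HTTPError` is re-raised so the caller can decide what to do.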
A better solution is to let users "turn on" API endpoints instead of supplying them implicitly. API endpoints should always host valid data -- even if it's stale.