
Intermittent thredds read errors behind twitcher

Open tlogan2000 opened this issue 1 year ago • 18 comments

Describe the bug

We have noticed intermittent NetCDF read errors when accessing OPeNDAP links in the PAVICS JupyterHub. The problem began around March 1, 2023.

To Reproduce

The intermittent nature of this problem makes it somewhat difficult to reproduce, but a public notebook is available on the PAVICS server: https://pavics.ouranos.ca/jupyter/hub/user-redirect/lab/tree/public/logan-public/Tests/THREDDS_Issues_March2023/Random_Thredds_read_errors.ipynb

The notebook executes a relatively large workflow and uses multiple dask worker processes to increase the likelihood of a read error.

Multiple notebook runs (only ~5-6, given the time each one takes) have shown that bypassing twitcher (i.e. thredds behind the nginx proxy only) always allows the calculations to complete successfully, whereas accessing opendap links behind the nginx/twitcher combination typically results in a read error relatively early in the workflow.

Although not rigorously quantified, there also seems to be a general performance hit when accessing data via nginx/twitcher: printed execution times in the workflow loop are 25-40 seconds with nginx/twitcher versus 18-30 seconds with nginx only. Note also that the notebook runs the 'nginx-only' code first, so I do not believe the difference is explained by data caching; if caching has any effect, it should benefit the 'twitcher/nginx' run.
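To put numbers on both the failure rate and the timing difference, a small harness along these lines could be used. This is only a sketch: `compare_endpoints`, its parameters and the endpoint labels are hypothetical names that do not come from the notebook, and `read_fn` stands in for whatever opens and loads the data (for example, a function wrapping `xarray.open_dataset(url).load()`).

```python
import time
from typing import Callable, Dict, List


def compare_endpoints(
    read_fn: Callable[[str], object],
    urls: Dict[str, str],
    n_trials: int = 5,
) -> Dict[str, dict]:
    """Call read_fn(url) n_trials times per endpoint and summarise the outcome.

    read_fn and the labels in `urls` are illustrative assumptions; pass in
    whatever function actually reads the dataset in your workflow.
    """
    results: Dict[str, dict] = {}
    for label, url in urls.items():
        errors: List[str] = []
        durations: List[float] = []
        for _ in range(n_trials):
            start = time.perf_counter()
            try:
                read_fn(url)
            except Exception as exc:  # broad on purpose: we only want to count failures
                errors.append(f"{type(exc).__name__}: {exc}")
            else:
                # Only successful reads contribute to the timing statistics.
                durations.append(time.perf_counter() - start)
        results[label] = {
            "trials": n_trials,
            "failures": len(errors),
            "mean_seconds": sum(durations) / len(durations) if durations else None,
            "errors": errors,
        }
    return results
```

Calling it with a dictionary such as `{"nginx-only": <direct opendap url>, "nginx/twitcher": <proxied opendap url>}` (the URLs are left as placeholders here) would give a failure count and mean read time per access path. Alternating the order of the endpoints between runs would help rule out caching effects.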

Expected behavior

Execution of the code without read errors.

  • OS: PAVICS jupyterlab (linux)

tlogan2000, Apr 17 '23 20:04