
Unable to access protected datasets on PAVICS

Open tlogan2000 opened this issue 1 year ago • 26 comments

Describe the bug

Recent updates to twitcher (or possibly magpie) break previously working workflows for accessing group-protected datasets on PAVICS. The code successfully navigates the THREDDS catalog and creates xarray datasets, but fails when it tries to actually load data.

Devs who wish to debug this can let me know so I can add their PAVICS username to the appropriate group.

To Reproduce

To run directly on PAVICS, see the example notebook: https://pavics.ouranos.ca/jupyter/hub/user-redirect/lab/tree/public/logan-public/Tests/THREDDS_Issues_March2023/Access-ESPO-G6_PAVICS_BROKEN.ipynb

from siphon.http_util import session_manager
from siphon.catalog import TDSCatalog
import requests
from requests_magpie import MagpieAuth, MagpieAuthenticationError
from pathlib import Path
import xarray as xr
from dask.distributed import Client
from dask.diagnostics import ProgressBar
import shutil

# Preliminary ESPO-G6 v1
cat_url =  "https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/catalog/datasets/simulations/bias_adjusted/cmip6/ouranos/ESPO-G/ESPO-G6v1.0.0/catalog.xml"  

# Preliminary ESPO-R5 v1
#cat_url = "https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/catalog/datasets/simulations/bias_adjusted/cmip5/ouranos/ESPO-R/ESPO-R5v1.0.0/catalog.xml"  

# The PAVICS_credentials.txt file is a simple text file with the username on the first line and the password on the second.
# These are the same credentials as for JupyterHub.
with Path('/notebook_dir/writable-workspace/PAVICS_credentials.txt').open() as f:
    AUTH_USR = f.readline().strip()
    AUTH_PWD = f.readline().strip()

# Authentication object that will request and cache the authentication cookies
auth = MagpieAuth("https://pavics.ouranos.ca/magpie/", AUTH_USR, AUTH_PWD)
# A full session, within which to make opendap requests.
session = requests.Session()
session.auth = auth


# Pass the authentication object to siphon's session manager
session_manager.set_session_options(auth=auth)

cat = TDSCatalog(cat_url)

ds = xr.open_dataset(cat.datasets[0].access_urls["OPENDAP"], chunks=dict(time=1460, lat=50, lon=50), decode_timedelta=False, session=session, engine='pydap', user_charset='utf8')
display(ds)

# loading data fails
with ProgressBar():
    ds.isel(time=0).tasmin.plot()
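
When debugging this, it may help to first rule out a malformed credentials file, since the script above silently yields empty strings if a line is missing. The following is a minimal sketch, not part of the original notebook: `read_credentials` is a hypothetical helper that assumes the same two-line layout described above (username on the first line, password on the second) and raises an error instead of passing empty values to `MagpieAuth`.

```python
from pathlib import Path
from typing import Tuple


def read_credentials(path) -> Tuple[str, str]:
    """Return (username, password) from a two-line credentials file.

    Hypothetical helper: assumes the username is on line 1 and the
    password on line 2, as in PAVICS_credentials.txt above.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Fail early if either line is missing or blank.
    if len(lines) < 2 or not lines[0].strip() or not lines[1].strip():
        raise ValueError(
            f"{path}: expected the username on line 1 and the password on line 2"
        )
    return lines[0].strip(), lines[1].strip()
```

The returned pair could then replace the manual `readline()` calls, e.g. `AUTH_USR, AUTH_PWD = read_credentials('/notebook_dir/writable-workspace/PAVICS_credentials.txt')`. This only excludes one possible cause on the client side; it does not address the twitcher or magpie behaviour itself.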

Expected behavior

The data loads and the plot renders, as it did before the update.

tlogan2000, Mar 13 '23 14:03