Cloud access for section of earthdata library notebook no longer working

Open scottyhq opened this issue 3 years ago • 5 comments

This code used to return "5" but now raises a traceback. Is cloud access no longer an option, @mikala-nsidc @betolink?

https://icesat-2.hackweek.io/tutorials/data_access/data_access_2_earthdata.html#cloud-access

# We can create a collections object from our query.

collections = Query.fields(['ShortName','Abstract']).get()

print(len(collections))
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.9/site-packages/earthdata/search.py:142, in DataCollections.get(self, limit)
    141 try:
--> 142     response.raise_for_status()
    143 except exceptions.HTTPError as ex:

File /srv/conda/envs/notebook/lib/python3.9/site-packages/requests/models.py:960, in Response.raise_for_status(self)
    959 if http_error_msg:
--> 960     raise HTTPError(http_error_msg, response=self)

HTTPError: 401 Client Error: Unauthorized for url: https://cmr.earthdata.nasa.gov/search/collections.umm_json?has_granules=true&include_granule_counts=true&keyword=land%20ice&bounding_box=-134.7,58.9,-133.9,59.2&provider=NSIDC_CPRD&&page_size=200&page_num=1

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
Input In [18], in <cell line: 3>()
      1 # We can create a collections object from our query.
----> 3 collections = Query.fields(['ShortName','Abstract']).get()
      5 print(len(collections))

File /srv/conda/envs/notebook/lib/python3.9/site-packages/earthdata/search.py:144, in DataCollections.get(self, limit)
    142     response.raise_for_status()
    143 except exceptions.HTTPError as ex:
--> 144     raise RuntimeError(ex.response.text)
    146 if self._format == "json":
    147     latest = response.json()["feed"]["entry"]

RuntimeError: {"errors":["Token [Bearer EDLXXX2b1b2] has expired. Note the token value has been partially redacted."]}

scottyhq avatar Apr 14 '22 23:04 scottyhq

This is more or less expected: datasets under an ACL have to be accessed with tokens that are (technically) valid for 3 months. There is an issue with CMR, though, and their validity is currently ~2 weeks (as far as I remember). I see 2 potential solutions. One is to regenerate the token used for this notebook (https://urs.earthdata.nasa.gov/users/<user>/user_tokens). A more lasting solution would be for earthdata to recreate the user's token whenever we get a 401. I can work on that, and a fix should be ready early next week.
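
A minimal sketch of the second option (recreate the token on a 401 and retry once). This is not earthdata's actual implementation: `get_with_token_refresh`, `send`, and `create_token` are hypothetical names, and the callables stand in for whatever performs the CMR request and whatever asks Earthdata Login for a new token.

```python
from typing import Callable, Tuple


def get_with_token_refresh(
    send: Callable[[str], Tuple[int, str]],
    token: str,
    create_token: Callable[[], str],
) -> Tuple[int, str, str]:
    """Send a request with `token`; on a 401, create a new token and retry once.

    `send` (hypothetical) takes a token and returns (status_code, body).
    `create_token` (hypothetical) returns a freshly generated token.
    Returns (status_code, body, token_that_was_last_used).
    """
    status, body = send(token)
    if status == 401:
        # The token has probably expired: generate a new one and retry once.
        token = create_token()
        status, body = send(token)
    return status, body, token
```

Retrying only once keeps a genuinely unauthorized request (for example, wrong credentials) from looping forever; the second 401 is simply returned to the caller.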

betolink avatar Apr 15 '22 01:04 betolink

Thanks for this information @betolink! It would be great to document how this works over in earthdata. I'm surprised that a program like earthdata can create or refresh a token under https://urs.earthdata.nasa.gov/users/<user>/user_tokens. I guess I'm thinking of these the same way as GitHub personal access tokens, which a user must create with a specific expiry time.

The only code dealing with authentication in this notebook is auth = Auth().login(strategy='netrc'), which I would have assumed passes the URS username and password from ~/.netrc with each request to a NASA server rather than doing anything with tokens?
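
For reference, the username and password stored in ~/.netrc can be read with Python's standard netrc module. This is only a sketch of that lookup, not what earthdata does internally; `read_urs_credentials` is a hypothetical helper, and the machine name urs.earthdata.nasa.gov is an assumption based on the URS URL in this thread.

```python
import netrc
from typing import Optional, Tuple


def read_urs_credentials(
    path: Optional[str] = None,
    host: str = "urs.earthdata.nasa.gov",  # assumed machine name in the netrc file
) -> Optional[Tuple[str, str]]:
    """Return (login, password) for `host`, or None if there is no matching entry.

    With path=None the standard library reads the default ~/.netrc.
    """
    entry = netrc.netrc(path).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password
```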

scottyhq avatar Apr 15 '22 18:04 scottyhq

On top of the regular credentials, some CMR queries for collections under ACLs (like IS2 in the cloud) require a token in the request header. There is an API for these tokens; earthdata currently uses it only to retrieve them, but it can also be used to generate them... I'm going to work on automating this.
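
A rough sketch of what "a token in the request header" looks like for the query in the traceback above. It assumes the token travels in an Authorization: Bearer header, which the error message suggests; `build_cmr_request` and the token value are hypothetical, and the request is only constructed here, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

CMR_COLLECTIONS_URL = "https://cmr.earthdata.nasa.gov/search/collections.umm_json"


def build_cmr_request(token: str, **params: str) -> Request:
    """Build (but do not send) a CMR collections query carrying a bearer token."""
    url = f"{CMR_COLLECTIONS_URL}?{urlencode(params)}"
    # Assumption: the token is passed as "Authorization: Bearer <token>".
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Example with a placeholder token; parameters are taken from the failing URL.
request = build_cmr_request(
    "EDL-placeholder-token",
    keyword="land ice",
    provider="NSIDC_CPRD",
    page_size="200",
)
```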

There should be a fix for this and a working readthedocs page next week! @scottyhq

betolink avatar Apr 15 '22 21:04 betolink

Hi @scottyhq, I just released a new version of earthdata, v0.3.1, and updated the access notebook. I plan to open a PR and see if that fixes things. Since the libraries don't seem to be pinned in the main environment file, I suppose your CI can rebuild the conda-lock files, so it should pick up the new version automagically, right?

betolink avatar Apr 28 '22 18:04 betolink

Some conda-forge builds need to be updated, working on it.

betolink avatar Apr 28 '22 21:04 betolink