Update fetchOrbit.py for new Copernicus data source

Open rtburns-jpl opened this issue 2 years ago • 14 comments

All kudos go to Scott for already figuring out how to do this using Python requests. It was super easy to port this over from sentineleof rather than trying to translate the curl calls from the Copernicus docs.

Fixes https://github.com/isce-framework/isce2/issues/792

rtburns-jpl avatar Dec 28 '23 22:12 rtburns-jpl

Hey @rtburns-jpl,

Thanks for providing the fix. Do you have an ETA for the merge to the main branch?

All the best for the new year!

toro-berlin avatar Jan 04 '24 11:01 toro-berlin

Hi Tobias, isce2 is in maintenance mode and doesn't really have ETAs. I'll merge this once I'm satisfied that it is a correct fix. If you'd like to speed up this process, you can help out by running this change yourself and letting me know if it works for you :)

rtburns-jpl avatar Jan 04 '24 17:01 rtburns-jpl

Hi @rtburns-jpl ,

Thank you for making the necessary fix. I have tested the change and it appears to be working well. However, I have noticed two issues that may need to be addressed:

(1) I have to hardcode my username and password. Otherwise, I am prompted to provide them for each orbit file.
(2) Only precise orbits are being downloaded. When precise orbits are not available, the program stops with errors.

Edit: (2) This only happened for one track, but seems to work fine for another. Not sure why.

yjzhenglamarmota avatar Jan 10 '24 00:01 yjzhenglamarmota

Hey @rtburns-jpl, thanks for clarifying! A colleague of mine will have a closer look soonish.

toro-berlin avatar Jan 10 '24 07:01 toro-berlin

The orbits can also be retrieved from the ASF at the URLs below, which don't require any password. See the curl command below for getting a listing of the latest restituted (RESORB) orbits. Maybe these URLs can/should be used in fetchOrbit.py?

https://s1qc.asf.alaska.edu/aux_poeorb
https://s1qc.asf.alaska.edu/aux_resorb

curl --ftp-ssl-reqd --silent --use-ascii --ftp-method nocwd --list-only https://s1qc.asf.alaska.edu/aux_resorb/ | grep 202401
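A rough Python equivalent of the curl-and-grep listing above might look like the sketch below. It assumes the listing response contains the full .EOF file names somewhere in its text (plain text or HTML links); the function names `orbit_names` and `fetch_listing` are hypothetical and are not part of fetchOrbit.py.

```python
import re
import urllib.request

# URL taken from the comment above; the format of the response is an assumption.
RESORB_URL = "https://s1qc.asf.alaska.edu/aux_resorb/"

# Orbit file names look like
# S1A_OPER_AUX_RESORB_OPOD_<created>_V<start>_<stop>.EOF
EOF_NAME = re.compile(
    r"S1[A-Z]_OPER_AUX_(?:RESORB|POEORB)_OPOD_"
    r"\d{8}T\d{6}_V\d{8}T\d{6}_\d{8}T\d{6}\.EOF"
)


def orbit_names(listing_text, contains=""):
    """Return the sorted, de-duplicated orbit file names found in a listing."""
    names = set(EOF_NAME.findall(listing_text))
    return sorted(name for name in names if contains in name)


def fetch_listing(url=RESORB_URL, timeout=30):
    """Download the directory listing as text (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With network access, `orbit_names(fetch_listing(), contains="202401")` would play the role of the `grep 202401` step.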

falkamelung avatar Jan 11 '24 20:01 falkamelung

> The orbits can also be retrieved from the ASF at the URLs below which don't require any password.

Only the listing is password-less. For downloading, one needs to login with earthdata credentials.

vincentschut avatar Jan 12 '24 09:01 vincentschut

Thanks for the feedback Yujie, I pushed an update so username/password should now be only entered once for multiple file downloads. If you have a test case where missing precise orbits cause that error, I'll see if I can fix that as well.

Since authentication is required for downloading from both copernicus and ASF, I'll stick with copernicus for now, but if there's an alternative that doesn't require auth I can try to switch over to that source.

rtburns-jpl avatar Jan 18 '24 18:01 rtburns-jpl

Thanks for your efforts on this! However, I believe the current approach generates a new token for each file to be downloaded, which is a problem when doing bulk downloads - eventually Copernicus gives a 401 Unauthorized error, and even my attempt to log in to their webpage is now rejected due to "too many sessions".

Some discussion here suggests generating only one token and re-using it: https://helpcenter.dataspace.copernicus.eu/hc/en-gb/community/posts/13783130873117-Parallel-downloads-for-the-same-user

ericlindsey avatar Jan 19 '24 19:01 ericlindsey

For me, downloading orbits from the ASF works without giving any password. An example is below. I don't recall saving anything on my computer that handles the authentication.

wget -c https://s1qc.asf.alaska.edu/aux_resorb/S1A_OPER_AUX_RESORB_OPOD_20240110T182733_V20240110T144317_20240110T180047.EOF

falkamelung avatar Jan 19 '24 19:01 falkamelung

> For me downloading orbits from the ASF works without giving any password.

For me, that only works when I have valid earthdata credentials defined in ~/.netrc. If not, the download fails. So it does need a login, but wget takes the credentials from ~/.netrc.

vincentschut avatar Jan 22 '24 11:01 vincentschut

@rtburns-jpl Here's a version that saves the auth token to a file, and only re-creates it if it expired (after 10 minutes): https://github.com/ericlindsey/isce2/blob/fetchOrbit-update/contrib/stack/topsStack/fetchOrbit.py

This is necessary if using 'stackSentinel.py' which runs fetchOrbit inside a loop. Otherwise the Dataspace server eventually rejects the downloads due to too many open sessions.

Feel free to update your version with this, or should I create a separate PR?
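
For readers following along, the token-reuse idea can be sketched roughly as follows. This is not the code from the linked branch: the JSON file layout, the 30-second safety margin, and the `fetch_token` callable (assumed to return a token and its lifetime in seconds) are all illustrative assumptions.

```python
import json
import time
from pathlib import Path


def load_cached_token(path, now=None, margin=30):
    """Return the cached token, or None if it is missing, unreadable or expired."""
    now = time.time() if now is None else now
    try:
        data = json.loads(Path(path).read_text())
        token, expires_at = data["access_token"], float(data["expires_at"])
    except (OSError, ValueError, KeyError, TypeError):
        return None
    # Treat a token that is about to expire as already expired.
    return token if now < expires_at - margin else None


def save_token(path, token, expires_in, now=None):
    """Store the token together with its absolute expiry time."""
    now = time.time() if now is None else now
    payload = {"access_token": token, "expires_at": now + expires_in}
    Path(path).write_text(json.dumps(payload))


def get_token(path, fetch_token, now=None):
    """Reuse the cached token if still valid; otherwise request one new token.

    fetch_token is a hypothetical callable returning (token, expires_in_seconds).
    """
    token = load_cached_token(path, now=now)
    if token is None:
        token, expires_in = fetch_token()
        save_token(path, token, expires_in, now=now)
    return token
```

Called once per orbit file inside a loop, `get_token` would only contact the identity server when the cached token has expired, which is the behaviour needed to avoid the "too many sessions" rejection. A real implementation would probably also restrict the permissions of the token file.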

ericlindsey avatar Jan 25 '24 20:01 ericlindsey

@rtburns-jpl let's pick this up please. Would you please take a look at @ericlindsey's comment above?

hfattahi avatar Apr 08 '24 15:04 hfattahi

Thanks Eric, I merged your auth token changes and it's a significant improvement :)

rtburns-jpl avatar Apr 08 '24 23:04 rtburns-jpl

Thanks! Glad to hear it!

ericlindsey avatar Apr 09 '24 17:04 ericlindsey

Can this be merged, @rtburns-jpl?

bjmarfito avatar Apr 25 '24 08:04 bjmarfito

Thanks for this PR.

If I have my username and password stored in ~/.netrc,

machine urs.earthdata.nasa.gov login blabla password blabla
machine dataspace.copernicus.eu login blabla password blabla

then when 'stackSentinel.py' runs fetchOrbit inside a loop, will it pick up the correct "dataspace.copernicus.eu" credentials?

In my case, it runs and downloads up to some point, then stops, refusing to create a new token:

requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token

I have to run fetchOrbit.py manually and pass the -u and -p flags again so that a new token file is created before I can move on.

Did I get something wrong?

yuankailiu avatar May 02 '24 00:05 yuankailiu

@yuankailiu, you are right that this doesn't read the login details from ~/.netrc, so the login cookie will eventually expire if running inside a loop. To avoid this I hard-coded the username and password on my server by setting them as 'default' values for the input arguments (lines 30 and 32). This isn't ideal so I agree it would be nice to update this to use the stored credentials file instead.

ericlindsey avatar May 02 '24 16:05 ericlindsey

@yuankailiu, so when you ran this script in a loop, you were using an existing token file which then expired and caused the remaining iterations to fail? The 500 error seems to suggest something is wrong on their end, at least according to this document: https://documentation.dataspace.copernicus.eu/APIs/SentinelHub/Overview/ErrorHandling.html

In any case, we should probably make some more tweaks for convenience:

  • Detect when token is expired, and regenerate using credentials (or if credentials were not provided, give a more appropriate error message)
  • Read username/password from .netrc file when possible (should be straightforward using the builtin netrc module: https://docs.python.org/3/library/netrc.html)
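
A minimal sketch of the second bullet, combined with the clearer error message from the first, could look like this. It assumes the host name `dataspace.copernicus.eu` from the ~/.netrc example earlier in the thread; `resolve_credentials` is a hypothetical helper, not a function in fetchOrbit.py.

```python
import netrc

# Host name as used in the ~/.netrc example earlier in this thread.
CDSE_HOST = "dataspace.copernicus.eu"


def resolve_credentials(username=None, password=None, netrc_file=None, host=CDSE_HOST):
    """Prefer explicit credentials, then fall back to the netrc entry for host."""
    if username and password:
        return username, password
    try:
        # netrc_file=None makes the netrc module read ~/.netrc.
        auth = netrc.netrc(netrc_file).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        auth = None
    if auth is None:
        raise RuntimeError(
            f"No credentials for {host}: pass a username and password "
            "or add a 'machine' entry for it to ~/.netrc"
        )
    login, _account, secret = auth
    return login, secret
```

Raising a descriptive error here, before any request is made, avoids the situation where a missing password only surfaces later as a generic HTTP error from the token endpoint.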

rtburns-jpl avatar May 02 '24 19:05 rtburns-jpl

I think the script just can't regenerate the token because stackSentinel.py runs fetchOrbit.py with only the default options (so no username/password is input). This expired token is what causes the error, I believe, though I'm not sure why it gives a generic 500 and not something more informative.

I think the second suggestion is what needs to be added - fetchOrbit should read the netrc file for the username/password rather than having it as an input parameter on the command line.

ericlindsey avatar May 02 '24 19:05 ericlindsey

Ok, I added .netrc file reading in that last commit, give it a go and let me know if it works for you.

rtburns-jpl avatar May 02 '24 19:05 rtburns-jpl

The netrc reading works well.

Before, when no token file was provided or the token was outdated, this 500 Server Error arose:

ykliu@kamb$ fetchOrbit.py -i S1A_IW_SLC__1SDV_20170303T141528_20170303T141555_015529_01983C_E4EC.zip -o .

Reference time:  2017-03-03 14:15:55
Satellite name:  S1A
generating a new access token
Traceback (most recent call last):
  File "/home/ykliu/apps/isce2/src/isce2/contrib/stack/topsStack/fetchOrbit.py", line 201, in <module>
    token, expires_in = get_new_token(username, password, session)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ykliu/apps/isce2/src/isce2/contrib/stack/topsStack/fetchOrbit.py", line 91, in get_new_token
    response.raise_for_status()
  File "/home/ykliu/apps/mambaforge/envs/geod/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token

Now, with the .netrc reading, the token gets generated:

ykliu@kamb$ fetchOrbit.py -i S1A_IW_SLC__1SDV_20170303T141528_20170303T141555_015529_01983C_E4EC.zip -o .

Reference time:  2017-03-03 14:15:55
Satellite name:  S1A
generating a new access token
Downloading URL:  https://zipper.dataspace.copernicus.eu/odata/v1/Products(2310867d-d741-4b43-958b-d01d8409bb76)/$value

yuankailiu avatar May 02 '24 20:05 yuankailiu

Not sure if this is something to consider, but could we also make fetchOrbit.py read directly from a folder and loop over the *.zip files? That would allow stand-alone use, so you can pre-download all the orbit files outside stackSentinel.py if you want to.

I added a -d (directory) flag here, using glob.glob to loop over the zip files: https://github.com/yuankailiu/isce2/commit/ee2d286b8dcb56363503c0d421610d9f16cc49e4
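
The directory loop could be sketched as below. This is not the code from the linked commit: the helper names are hypothetical, and taking the fourth field from the end of the file name as the reference time is an assumption based on the "Reference time: 2017-03-03 14:15:55" line in the log earlier in this thread.

```python
import glob
import os
from datetime import datetime


def find_slc_zips(directory):
    """Return the Sentinel-1 zip files in directory, sorted by name."""
    return sorted(glob.glob(os.path.join(directory, "S1*.zip")))


def sensor_and_time(zip_path):
    """Extract the satellite name and sensing stop time from an SLC file name."""
    fields = os.path.basename(zip_path).split("_")
    # The fourth field from the end is the sensing stop time,
    # e.g. 20170303T141555 in the file name from the log earlier in the thread.
    return fields[0], datetime.strptime(fields[-4], "%Y%m%dT%H%M%S")
```

A stand-alone pre-download would then loop over `find_slc_zips(directory)` and fetch one orbit file per `(satellite, time)` pair, reusing a single token across the loop.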

yuankailiu avatar May 02 '24 20:05 yuankailiu

Sure, if others find that useful I'd be happy to merge it. For now I'll keep this PR focused on just updating to support the new data source but feel free to open a new one with your change once this one gets merged.

rtburns-jpl avatar May 02 '24 20:05 rtburns-jpl