Update fetchOrbit.py for new Copernicus data source
All kudos go to Scott for already figuring out how to do this using Python requests. It was super easy to port this over from sentineleof compared with translating the curl calls from the Copernicus docs.
Fixes https://github.com/isce-framework/isce2/issues/792
Hey @rtburns-jpl,
Thanks for providing the fix. Do you have an ETA for the merge to the main branch?
All the best for the new year!
Hi Tobias, isce2 is in maintenance mode and doesn't really have ETAs. I'll merge this once I'm satisfied that it is a correct fix. If you'd like to speed up this process, you can help out by running this change yourself and letting me know if it works for you :)
Hi @rtburns-jpl ,
Thank you for making the necessary fix. I have tested the change and it appears to be working well. However, I have noticed two issues that may need to be addressed:
(1) I have to hardcode my username and password. Otherwise, I am prompted to provide them for each orbit file. (2) Only precise orbits are being downloaded. When precise orbits are not available, the program stops with errors.
Edit: (2) only happened for one track; it seems to work fine for another. Not sure why.
Hey @rtburns-jpl, thanks for clarifying! A colleague of mine will have a closer look soonish.
The orbits can also be retrieved from the ASF at the URLs below, which don't require any password. See the curl command below for a listing of the latest restituted (RESORB) orbits. Maybe these URLs can/should be used in fetchOrbit.py?
https://s1qc.asf.alaska.edu/aux_poeorb https://s1qc.asf.alaska.edu/aux_resorb
curl --ftp-ssl-reqd --silent --use-ascii --ftp-method nocwd --list-only https://s1qc.asf.alaska.edu/aux_resorb/ | grep 202401
> The orbits can also be retrieved from the ASF at the URLs below which don't require any password.
Only the listing is password-less. For downloading, one needs to login with earthdata credentials.
Thanks for the feedback Yujie, I pushed an update so username/password should now be only entered once for multiple file downloads. If you have a test case where missing precise orbits cause that error, I'll see if I can fix that as well.
Since authentication is required for downloading from both copernicus and ASF, I'll stick with copernicus for now, but if there's an alternative that doesn't require auth I can try to switch over to that source.
Thanks for your efforts on this! However, I believe the current approach generates a new token for each file to be downloaded, which is a problem when doing bulk downloads - eventually Copernicus gives a 401 Unauthorized error, and even my attempt to log in to their webpage is now rejected due to "too many sessions".
Some discussion here suggests generating only one token and re-using it: https://helpcenter.dataspace.copernicus.eu/hc/en-gb/community/posts/13783130873117-Parallel-downloads-for-the-same-user
For me, downloading orbits from the ASF works without giving any password. Below is an example. I don't recall saving anything on my computer that handles the authentication.
wget -c https://s1qc.asf.alaska.edu/aux_resorb/S1A_OPER_AUX_RESORB_OPOD_20240110T182733_V20240110T144317_20240110T180047.EOF
> For me downloading orbits from the ASF works without giving any password.
For me, that only works when I have valid earthdata credentials defined in ~/.netrc. If not, the download fails. So it does need a login, but wget takes the credentials from ~/.netrc.
@rtburns-jpl Here's a version that saves the auth token to a file, and only re-creates it if it expired (after 10 minutes): https://github.com/ericlindsey/isce2/blob/fetchOrbit-update/contrib/stack/topsStack/fetchOrbit.py
This is necessary if using 'stackSentinel.py' which runs fetchOrbit inside a loop. Otherwise the Dataspace server eventually rejects the downloads due to too many open sessions.
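A minimal sketch of the token-file idea described above, not the code in the linked branch: cache the token together with its expiry time, and only request a new one when the cached copy is missing or about to expire. The function names, the JSON layout, and the 30-second safety margin are assumptions made for illustration. `request_new_token` is a hypothetical callable standing in for whatever performs the real login; it is assumed to return `(token, expires_in_seconds)`.

```python
import json
import time
from pathlib import Path


def load_cached_token(cache_path, now=None, margin=30):
    """Return the cached token if it is still valid, otherwise None.

    `margin` (seconds) is an assumed safety buffer so that a token is not
    used when it is just about to expire.
    """
    now = time.time() if now is None else now
    path = Path(cache_path)
    if not path.exists():
        return None
    try:
        data = json.loads(path.read_text())
        token = data["access_token"]
        expires_at = float(data["expires_at"])
    except (ValueError, KeyError, TypeError):
        # Unreadable or malformed cache file: treat it as "no token".
        return None
    if now + margin >= expires_at:
        return None
    return token


def save_token(cache_path, token, expires_in, now=None):
    """Write the token and its absolute expiry time to the cache file."""
    now = time.time() if now is None else now
    payload = {"access_token": token, "expires_at": now + expires_in}
    Path(cache_path).write_text(json.dumps(payload))


def get_token(cache_path, request_new_token, now=None):
    """Reuse the cached token when possible; otherwise request and cache a new one."""
    token = load_cached_token(cache_path, now=now)
    if token is None:
        token, expires_in = request_new_token()
        save_token(cache_path, token, expires_in, now=now)
    return token
```

With a 10-minute lifetime, a loop that calls `get_token` once per orbit file would log in roughly once every ten minutes instead of once per file. A real cache file holding a token should also be readable only by its owner.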
Feel free to update your version with this, or should I create a separate PR?
@rtburns-jpl let's pick this up please. Would you please take a look at @ericlindsey's comment above?
Thanks Eric, I merged your auth token changes and it's a significant improvement :)
Thanks! Glad to hear it!
Can this be merged, @rtburns-jpl?
Thanks for this PR.
If I have my username and password stored in ~/.netrc,
machine urs.earthdata.nasa.gov login blabla password blabla
machine dataspace.copernicus.eu login blabla password blabla
then when 'stackSentinel.py' runs fetchOrbit inside a loop, will it pick up the correct "dataspace.copernicus.eu" credentials?
In my case it runs and downloads to some point, and stops, refusing to make a new token:
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token
I have to run fetchOrbit.py manually and pass the -u and -p flags again to get a new token file created before I can move on.
Did I get something wrong?
@yuankailiu, you are right that this doesn't read the login details from ~/.netrc, so the login cookie will eventually expire if running inside a loop. To avoid this I hard-coded the username and password on my server by setting them as 'default' values for the input arguments (lines 30 and 32). This isn't ideal so I agree it would be nice to update this to use the stored credentials file instead.
@yuankailiu, so when you ran this script in a loop, you were using an existing token file which then expired and caused the remaining iterations to fail? The 500 error seems to suggest something is wrong on their end, at least according to this document: https://documentation.dataspace.copernicus.eu/APIs/SentinelHub/Overview/ErrorHandling.html
In any case, we should probably make some more tweaks for convenience:
- Detect when token is expired, and regenerate using credentials (or if credentials were not provided, give a more appropriate error message)
- Read username/password from .netrc file when possible (should be straightforward using the builtin netrc module: https://docs.python.org/3/library/netrc.html)
I think the script just can't regenerate the token because stackSentinel.py runs fetchOrbit.py with only the default options (so no username/password is passed in). I believe this expired token is what causes the error, though I'm not sure why it gives a generic 500 and not something more informative.
I think the second suggestion is what needs to be added - fetchOrbit should read the netrc file for the username/password rather than having it as an input parameter on the command line.
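For illustration, a sketch of that lookup using the standard-library `netrc` module. This is not necessarily how fetchOrbit.py ends up doing it. The helper names and the error message are hypothetical, and the default host is assumed to match the `machine dataspace.copernicus.eu` entry shown earlier in this thread.

```python
import netrc


def read_netrc_credentials(host="dataspace.copernicus.eu", netrc_file=None):
    """Return (login, password) for `host` from a netrc file, or None.

    With netrc_file=None the standard library falls back to ~/.netrc.
    """
    try:
        auth = netrc.netrc(netrc_file).authenticators(host)
    except (FileNotFoundError, netrc.NetrcParseError):
        return None
    if auth is None:
        return None
    login, _account, password = auth
    return login, password


def resolve_credentials(username=None, password=None,
                        host="dataspace.copernicus.eu", netrc_file=None):
    """Prefer explicitly supplied credentials, then fall back to netrc."""
    if username and password:
        return username, password
    creds = read_netrc_credentials(host, netrc_file)
    if creds is None:
        # A clear message instead of a failed login request further down.
        raise RuntimeError(
            "No credentials for %s: pass a username/password or add a "
            "'machine %s' entry to ~/.netrc" % (host, host)
        )
    return creds
```

Keeping command-line values as the first choice means existing `-u`/`-p` usage keeps working, while a loop that passes no options can still log in from the stored file.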
Ok, I added .netrc file reading in that last commit, give it a go and let me know if it works for you.
The netrc reading works well.
Before, when no token file was provided or the token was outdated, this 500 Server Error arose:
ykliu@kamb$ fetchOrbit.py -i S1A_IW_SLC__1SDV_20170303T141528_20170303T141555_015529_01983C_E4EC.zip -o .
Reference time: 2017-03-03 14:15:55
Satellite name: S1A
generating a new access token
Traceback (most recent call last):
  File "/home/ykliu/apps/isce2/src/isce2/contrib/stack/topsStack/fetchOrbit.py", line 201, in <module>
    token, expires_in = get_new_token(username, password, session)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ykliu/apps/isce2/src/isce2/contrib/stack/topsStack/fetchOrbit.py", line 91, in get_new_token
    response.raise_for_status()
  File "/home/ykliu/apps/mambaforge/envs/geod/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token
Now, with the .netrc reading, the token gets generated:
ykliu@kamb$ fetchOrbit.py -i S1A_IW_SLC__1SDV_20170303T141528_20170303T141555_015529_01983C_E4EC.zip -o .
Reference time: 2017-03-03 14:15:55
Satellite name: S1A
generating a new access token
Downloading URL: https://zipper.dataspace.copernicus.eu/odata/v1/Products(2310867d-d741-4b43-958b-d01d8409bb76)/$value
Not sure if this is something to consider, but could we also make fetchOrbit.py read directly from a folder and loop over the *.zip files? That would give it stand-alone functionality for pre-downloading all the orbit files outside stackSentinel.py, if you want that.
I added a -d (directory) flag here, which uses glob.glob to loop over the zip files:
https://github.com/yuankailiu/isce2/commit/ee2d286b8dcb56363503c0d421610d9f16cc49e4
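Roughly, the directory loop could look like the sketch below. It is not the code in the linked commit. The helper names are hypothetical, and `fetch_one` is a placeholder callable for the existing per-file download logic. The satellite name and reference time are taken from the first and fourth-from-last underscore-separated fields of the file name, an assumption that matches the "Satellite name" and "Reference time" values printed in the log output earlier in this thread.

```python
import glob
import os
from datetime import datetime


def parse_slc_name(path):
    """Extract (satellite, reference_time) from a Sentinel-1 SLC zip name.

    Assumes names like
    S1A_IW_SLC__1SDV_20170303T141528_20170303T141555_015529_01983C_E4EC.zip
    where the second timestamp is used as the reference time.
    """
    fields = os.path.basename(path).split("_")
    satellite = fields[0]
    ref_time = datetime.strptime(fields[-4], "%Y%m%dT%H%M%S")
    return satellite, ref_time


def find_slc_zips(directory):
    """Return the sorted list of S1*.zip files found in `directory`."""
    return sorted(glob.glob(os.path.join(directory, "S1*.zip")))


def fetch_orbits_for_directory(directory, fetch_one):
    """Call fetch_one(satellite, ref_time) once per SLC zip in `directory`.

    Returns a dict mapping each zip file name to whatever fetch_one returned.
    """
    results = {}
    for path in find_slc_zips(directory):
        satellite, ref_time = parse_slc_name(path)
        results[os.path.basename(path)] = fetch_one(satellite, ref_time)
    return results
```

Passing the download step in as a callable keeps the loop independent of how authentication and the actual request are handled.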
Sure, if others find that useful I'd be happy to merge it. For now I'll keep this PR focused on just updating to support the new data source but feel free to open a new one with your change once this one gets merged.