Refine subscriber/downloader behavior with metadata (.md5, .xml)

Open celiaou-podaac opened this issue 2 years ago • 0 comments

This use case comes from a data provider who is currently using a workaround. When metadata files (.md5, .xml) are requested with the '-e' option, both the https and s3 links for the same file are returned, and the downloads from the s3 links either fail (example 1) or are skipped (example 2). As a result, the tool reports double the number of files found and then fails or skips half of them. We would like the tool to return only the https links, as it already does for other extensions.
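To make the requested behavior concrete, here is a minimal sketch of the kind of filtering we have in mind. It is not the subscriber's actual code: the function name `filter_https_links` and its parameters are hypothetical, and it assumes the candidate links are available as plain URL strings before the download step.

```python
from urllib.parse import urlparse


def filter_https_links(links, extensions):
    """Hypothetical helper: keep only http(s) links that end with a requested extension.

    links      -- iterable of URL strings (may mix https:// and s3:// schemes)
    extensions -- iterable of suffixes such as [".md5", ".xml"]
    """
    suffixes = tuple(extensions)
    selected = []
    for link in links:
        parsed = urlparse(link)
        # Drop s3:// (and any other non-HTTP) links so each file is counted once.
        if parsed.scheme not in ("http", "https"):
            continue
        if parsed.path.endswith(suffixes):
            selected.append(link)
    return selected


if __name__ == "__main__":
    # The two links reported for one file in example 2.
    candidates = [
        "https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-public/"
        "OSCAR_L4_OC_FINAL_V2.0/oscar_currents_final_19950103.nc.md5",
        "s3://podaac-ops-cumulus-public/OSCAR_L4_OC_FINAL_V2.0/"
        "oscar_currents_final_19950103.nc.md5",
    ]
    for url in filter_https_links(candidates, [".md5"]):
        print(url)
```

With a filter like this applied before the count is logged, example 2 would be expected to report 3 files found rather than 6, with no skipped s3 entries.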

Example 1 (restricted access):
podaac-data-subscriber -c SWOTCalVal_GNSS_L0_1.0 -d ./newTest -sd 2023-05-10T00:00:00Z -ed 2023-06-01T00:00:00Z -e .xml
[...]
[2023-08-09 14:21:15,425] {podaac_data_subscriber.py:329} INFO - Downloaded Files: 3
[2023-08-09 14:21:15,429] {podaac_data_subscriber.py:330} INFO - Failed Files: 3
[2023-08-09 14:21:15,433] {podaac_data_subscriber.py:331} INFO - Skipped Files: 0
[2023-08-09 14:21:16,330] {podaac_data_subscriber.py:339} INFO - END

Example 2: podaac-data-subscriber -c OSCAR_L4_OC_FINAL_V2.0 -d ./xmltest -e .md5 -sd 1995-01-01T00:00:00Z -ed 1995-01-03T00:00:00Z

(base) C:\Projects>podaac-data-subscriber -c OSCAR_L4_OC_FINAL_V2.0 -d ./xmltest -e .md5 -sd 1995-01-01T00:00:00Z -ed 1995-01-03T00:00:00Z
[2023-08-09 13:47:46,865] {podaac_data_subscriber.py:183} WARNING - No .update__OSCAR_L4_OC_FINAL_V2.0 in the data directory. (Is this the first run?)
[2023-08-09 13:47:48,854] {podaac_data_subscriber.py:275} INFO - Found 6 total files to download
[2023-08-09 13:47:49,673] {podaac_data_subscriber.py:314} INFO - 2023-08-09 13:47:49.672906 SUCCESS: https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-public/OSCAR_L4_OC_FINAL_V2.0/oscar_currents_final_19950103.nc.md5
[2023-08-09 13:47:49,683] {podaac_data_subscriber.py:306} INFO - 2023-08-09 13:47:49.683233 SKIPPED: s3://podaac-ops-cumulus-public/OSCAR_L4_OC_FINAL_V2.0/oscar_currents_final_19950103.nc.md5
[2023-08-09 13:47:50,127] {podaac_data_subscriber.py:314} INFO - 2023-08-09 13:47:50.127747 SUCCESS: https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-public/OSCAR_L4_OC_FINAL_V2.0/oscar_currents_final_19950102.nc.md5
[2023-08-09 13:47:50,128] {podaac_data_subscriber.py:306} INFO - 2023-08-09 13:47:50.128978 SKIPPED: s3://podaac-ops-cumulus-public/OSCAR_L4_OC_FINAL_V2.0/oscar_currents_final_19950102.nc.md5
[2023-08-09 13:47:50,596] {podaac_data_subscriber.py:314} INFO - 2023-08-09 13:47:50.596905 SUCCESS: https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-public/OSCAR_L4_OC_FINAL_V2.0/oscar_currents_final_19950101.nc.md5
[2023-08-09 13:47:50,598] {podaac_data_subscriber.py:306} INFO - 2023-08-09 13:47:50.598805 SKIPPED: s3://podaac-ops-cumulus-public/OSCAR_L4_OC_FINAL_V2.0/oscar_currents_final_19950101.nc.md5
[2023-08-09 13:47:50,601] {podaac_data_subscriber.py:329} INFO - Downloaded Files: 3
[2023-08-09 13:47:50,604] {podaac_data_subscriber.py:330} INFO - Failed Files: 0
[2023-08-09 13:47:50,605] {podaac_data_subscriber.py:331} INFO - Skipped Files: 3
[2023-08-09 13:47:50,800] {podaac_data_subscriber.py:339} INFO - END

celiaou-podaac, Aug 09 '23 21:08