data-subscriber
Not finding data to download in GRACEFO_L2_CSR_MONTHLY_0060 dataset
I spoke too soon. The downloader is still working flawlessly for the Sentinel-6 data (thank you!). With optimism in mind, I moved to switch over my GRACE monthly spherical harmonic downloads to PODAAC as well. And… can’t get it to work.
Here’s the command I typed in: podaac-data-downloader -c GRACEFO_L2_CSR_MONTHLY_0060 -d /Volumes/DataDisk/GRACE_RL06/CSR_SPHARM_60 -sd 2018-01-01T00:00:00Z -ed 2018-12-31T00:00:00Z
I’m looking for this data (which is supposed to be “cloud enabled” now – I checked this time!) https://podaac.jpl.nasa.gov/dataset/GRACEFO_L2_CSR_MONTHLY_0060
I'm downloading by year, so for the first run I was looking for all the data from 2018-01-01 to 2018-12-31.
When I called that, all the output I got was:
Found 0 total files to download
Downloaded: 0 files
Files Failed to download:0
Above issue by Jennifer Bonin
Short answer: add -e "" to your command line.
podaac-data-downloader -c GRACEFO_L2_CSR_MONTHLY_0060 -d /Volumes/DataDisk/GRACE_RL06/CSR_SPHARM_60 -sd 2018-01-01T00:00:00Z -ed 2018-12-31T00:00:00Z -e ""
Longer answer:
There is no suffix on these data files, so none of the default suffixes the downloader looks for match them. The default list is:
-e EXTENSIONS, --extensions EXTENSIONS
The extensions of products to download. Default is [.nc, .h5, .zip, .tar.gz]
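To illustrate why nothing is found, here is a minimal sketch of suffix-based filtering. It is not the downloader's actual code: `filter_by_extension` is a hypothetical helper, it assumes a plain "ends with" match, and the `.nc` URL is made up for contrast. The GRACE-FO URL is the example file reported later in this thread.

```python
# Defaults as quoted in the help text above.
DEFAULT_EXTENSIONS = [".nc", ".h5", ".zip", ".tar.gz"]


def filter_by_extension(urls, extensions=DEFAULT_EXTENSIONS):
    """Keep only URLs ending with one of the suffixes (hypothetical helper)."""
    suffixes = tuple(extensions)  # str.endswith accepts a tuple of suffixes
    return [url for url in urls if url.endswith(suffixes)]


urls = [
    # GRACE-FO granule with no file suffix at all
    "https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/"
    "GRACEFO_L2_CSR_MONTHLY_0060/GSM-2_2018335-2018365_GRFO_UTCSR_BB01_0600",
    # made-up URL, only here for contrast
    "https://example.invalid/some_granule.nc",
]

print(filter_by_extension(urls))        # the suffix-less GRACE-FO file is dropped
print(filter_by_extension(urls, [""]))  # an empty suffix matches every name
```

Under that assumption, an empty extension works because every string ends with the empty string, which is also why it stops filtering anything for other products.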
We’ll create a fix for that.
-Mike
After doing some regression testing, adding an empty extension causes a whole slew of problems for other products (Yay testing!). I think we have some options, but no default list will really work.
- Default to download everything, and let users specify their own extensions if they prefer. The big issue with this is that it breaks backwards compatibility (e.g. a user will get a bunch of new files). It also raises questions: do we download metadata files, and does this let users download more data/files than they want, with a direct impact on DAAC costs?
- Keep the defaults and let a user know when the filters cause an issue.
I'm leaning towards option 2 since 1) it's easier to implement now and 2) it maintains the current paradigm of down selecting data and won't break existing workflows.
Here's an example message:
gangl$ python subscriber/podaac_data_downloader.py -c GRACEFO_L2_CSR_MONTHLY_0060 -d ./GRACE_RL06/CSR_SPHARM_60 -sd 2018-01-01T00:00:00Z -ed 2018-12-31T00:00:00Z --limit 2
Warning: only the most recent 2 granules will be downloaded; try adjusting your search criteria (suggestion: reduce time period or spatial region of search) to ensure you retrieve all granules.
WARNING: All downloads have been filtered out based on the user provided or default extensions: ['.nc', '.h5', '.zip', '.tar.gz', '.nc4', '.nc3']. Please specify an extension (-e) that allows for downlaoding of these files.
WARNING: Example data files for this collection include[['https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/GRACEFO_L2_CSR_MONTHLY_0060/GSM-2_2018335-2018365_GRFO_UTCSR_BB01_0600']]
Found 0 total files to download
Downloaded: 0 files
My two cents: the warning message could also include the number of files actually found and list a couple of example files, so people can work out which extension to specify from the examples.
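A rough sketch of what that could look like, assuming a simple suffix match. `filter_with_warning` and its `max_examples` parameter are hypothetical names for illustration, not part of the downloader, and the message wording is only a suggestion.

```python
def filter_with_warning(urls, extensions, max_examples=2):
    """Filter URLs by suffix; build warnings if everything is filtered out.

    Hypothetical helper: returns (kept_urls, warning_messages).
    """
    suffixes = tuple(extensions)
    kept = [url for url in urls if url.endswith(suffixes)]
    warnings = []
    if urls and not kept:
        # Report how many files were found before filtering...
        warnings.append(
            f"WARNING: {len(urls)} files were found, but all were filtered out "
            f"by the extensions {list(extensions)}. "
            "Please specify an extension (-e) that matches these files."
        )
        # ...and show a few names so the user can pick a matching extension.
        warnings.append(
            f"WARNING: Example data files for this collection include {urls[:max_examples]}"
        )
    return kept, warnings


found = [
    "https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/"
    "GRACEFO_L2_CSR_MONTHLY_0060/GSM-2_2018335-2018365_GRFO_UTCSR_BB01_0600",
]
kept, messages = filter_with_warning(found, [".nc", ".h5", ".zip", ".tar.gz"])
for message in messages:
    print(message)
```

Returning the messages instead of printing them inside the function keeps the sketch easy to test; the real tool may well prefer to log them directly.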