earthaccess
earthaccess copied to clipboard
Add `dataset.services()` method to list available Harmony services
Github Issue: 447
Description
List available Harmony services for a collection. As a first step to facilitating the use of services in earthaccess, earthaccess was modified so that it can list the available services for a collection.
Overview of work done
A new services module was created that performs service queries. The results module was updated to include a services function which will use the services module to query a collection's services and return the provider-id and umm JSON for the service.
Sample services results:
{
"provider-id": "POCLOUD",
"umm": {
"URL": {
"Description": "the main access point",
"URLValue": "https://opendap.earthdata.nasa.gov/"
},
"Type": "OPeNDAP",
"ServiceKeywords": [
{
"ServiceCategory": "EARTH SCIENCE SERVICES",
"ServiceTopic": "DATA MANAGEMENT/DATA HANDLING",
"ServiceTerm": "DATA SUBSETTING/SUPERSETTING",
"ServiceSpecificTerm": "SPATIAL SUBSETTING"
},
{
"ServiceCategory": "EARTH SCIENCE SERVICES",
"ServiceTopic": "DATA MANAGEMENT/DATA HANDLING",
"ServiceTerm": "DATA SUBSETTING/SUPERSETTING",
"ServiceSpecificTerm": "TEMPORAL SUBSETTING"
},
{
"ServiceCategory": "EARTH SCIENCE SERVICES",
"ServiceTopic": "DATA MANAGEMENT/DATA HANDLING",
"ServiceTerm": "DATA SUBSETTING/SUPERSETTING",
"ServiceSpecificTerm": "VARIABLE SUBSETTING"
}
],
"ServiceOrganizations": [
{
"Roles": [
"DEVELOPER"
],
"ShortName": "UCAR/UNIDATA",
"LongName": "Unidata, University Corporation for Atmospheric Research"
}
],
"OperationMetadata": [
{
"DistributedComputingPlatform": [
"WEBSERVICES"
]
}
],
"Description": "Earthdata OPEnDAP in the cloud",
"Version": "9",
"Name": "PO.DAAC Cloud OPeNDAP",
"ServiceOptions": {
"Subset": {
"SpatialSubset": {
"BoundingBox": {
"AllowMultipleValues": false
}
},
"TemporalSubset": {
"AllowMultipleValues": false
},
"VariableSubset": {
"AllowMultipleValues": true
}
},
"SupportedReformattings": [
{
"SupportedInputFormat": "NETCDF-4",
"SupportedOutputFormats": [
"ASCII",
"CSV",
"NETCDF-3",
"NETCDF-4"
]
},
{
"SupportedInputFormat": "HDF5",
"SupportedOutputFormats": [
"ASCII",
"CSV",
"NETCDF-3",
"NETCDF-4"
]
}
]
},
"MetadataSpecification": {
"URL": "https://cdn.earthdata.nasa.gov/umm/service/v1.5.2",
"Name": "UMM-S",
"Version": "1.5.2"
},
"LongName": "PO.DAAC OPeNDADP In the Cloud"
}
}
Overview of verification done
Tested new service functionality on four collections:
- HLSS30
- MUR-JPL-L4-GLOB-v4.1
- VIIRS_NPP-JPL-L2P-v2016.2
- ATL22
- TOMSN7SO2
Overview of integration done
Created a new unit test to test DataService get. Unit test coverage:
---------- coverage: platform linux, python 3.11.6-final-0 -----------
Name Stmts Miss Cover Missing
---------------------------------------------------------------------
earthaccess/__init__.py 31 3 90% 77-81
earthaccess/api.py 107 77 28% 26-28, 65-78, 117, 124, 143-158, 182-193, 211-213, 234-239, 248-252, 261-265, 284-285, 307-308, 328-336, 345-346, 350-355
earthaccess/auth.py 207 80 61% 17-18, 49-63, 95-96, 122-154, 188-206, 211-215, 242, 251-252, 260-270, 316-317, 346-358, 362-372, 384
earthaccess/daac.py 20 6 70% 136, 140-147
earthaccess/formatters.py 17 10 41% 11, 18, 22-59
earthaccess/kerchunk.py 32 26 19% 13-22, 32-58
earthaccess/results.py 152 81 47% 28-31, 34-47, 88-97, 107-109, 117, 124-130, 137-139, 146-148, 155-158, 165-166, 174-176, 184-200, 212-220, 223, 258-261, 268-276, 283-284, 287-290, 306-317, 321-329, 355, 379-380
earthaccess/search.py 305 139 54% 47, 63-72, 88-89, 99-100, 114, 129-133, 146-150, 165-180, 184-186, 194-195, 203-204, 217, 235-236, 244, 276-312, 374, 392-396, 421, 441-442, 450, 458-464, 474-475, 489-498, 511-516, 525-526, 534-535, 543-544, 552-553, 564-565, 573-574, 582, 588, 628, 634-638, 641, 650, 654, 660-662, 665, 678-679, 721-722, 731-732, 741-742, 772-773, 782-783, 795-803
earthaccess/services.py 34 6 82% 31, 51, 57-58, 61, 66
earthaccess/store.py 289 190 34% 27-28, 31, 34, 42, 50-55, 62-75, 82-86, 109-114, 118-121, 124-125, 128-131, 134-137, 148, 156-159, 183-190, 193-194, 214, 224, 239-242, 292, 310-312, 331, 340-385, 394-440, 467-478, 506, 516-536, 546-582, 595-618, 634-649, 656-665
earthaccess/utils/_validation.py 5 3 40% 5-7
earthaccess/widgets.py 0 0 100%
tests/unit/__init__.py 0 0 100%
tests/unit/conftest.py 0 0 100%
tests/unit/test_auth.py 60 2 97% 133-134
tests/unit/test_collection_queries.py 30 0 100%
tests/unit/test_formatters.py 0 0 100%
tests/unit/test_granule_queries.py 22 0 100%
tests/unit/test_results.py 22 0 100%
tests/unit/test_store.py 58 0 100%
---------------------------------------------------------------------
TOTAL 1391 623 55%
========================================================================================================= 26 passed, 2 warnings in 1.75s =========================================================================================================
+ bash ./scripts/lint.sh
+ mypy earthaccess --disallow-untyped-defs
Success: no issues found in 12 source files
+ ruff check .
Create a new integration test that tests the services functionality from the service query to the parsing of query results. Integration test coverage:
---------- coverage: platform linux, python 3.11.6-final-0 -----------
Name Stmts Miss Cover Missing
-------------------------------------------------------------------------
earthaccess/__init__.py 31 4 87% 68-72, 75
earthaccess/api.py 107 33 69% 77, 124, 144-152, 190-192, 213, 237-238, 248-252, 261-265, 284-285, 331-334, 336, 345-346
earthaccess/auth.py 207 59 71% 17-18, 122-154, 188-206, 211-212, 242, 251-252, 260-261, 266, 269, 346-358, 362-372, 384
earthaccess/daac.py 20 2 90% 136, 147
earthaccess/formatters.py 17 10 41% 11, 18, 22-59
earthaccess/kerchunk.py 32 26 19% 13-22, 32-58
earthaccess/results.py 152 58 62% 28-31, 34-47, 88-97, 107-109, 124-130, 137-139, 146-148, 155-158, 165-166, 174-176, 193, 223, 258-261, 268-276, 283-284, 287-290, 316-317, 321-329, 355, 379-380
earthaccess/search.py 305 81 73% 69-70, 88-89, 114, 129-133, 146-150, 172, 184-186, 194-195, 203-204, 217, 235-236, 284, 290-294, 297, 304, 392-396, 421, 441-442, 459, 474-475, 490, 511-516, 525-526, 543-544, 552-553, 574, 582, 628, 634-638, 641, 650, 662, 678-679, 721-722, 731-732, 741-742, 772-773, 782-783, 795-803
earthaccess/services.py 34 6 82% 31, 51, 57-58, 61, 66
earthaccess/store.py 289 89 69% 34, 42, 62-75, 109-114, 118-121, 124-125, 128-131, 159, 183-190, 193-194, 214, 224, 241-242, 312, 331, 345, 373-374, 383-385, 394-440, 468-470, 478, 506, 523-536, 596, 612-617, 635, 637, 660-661
earthaccess/utils/_validation.py 5 0 100%
earthaccess/widgets.py 0 0 100%
tests/integration/conftest.py 9 5 44% 8-12
tests/integration/test_api.py 51 0 100%
tests/integration/test_auth.py 82 15 82% 56, 64-65, 87-95, 106, 117-118
tests/integration/test_cloud_download.py 80 13 84% 88-89, 130-133, 141-142, 147-156, 169
tests/integration/test_cloud_open.py 82 5 94% 101, 134-135, 159, 169
tests/integration/test_kerchunk.py 41 33 20% 11-87
tests/integration/test_onprem_download.py 82 5 94% 94, 127-128, 159, 161
tests/integration/test_onprem_open.py 75 4 95% 93, 126-127, 151
tests/integration/test_services.py 24 0 100%
-------------------------------------------------------------------------
TOTAL 1725 448 74%
================================================================================================================= short test summary info ==================================================================================================================
FAILED tests/integration/test_api.py::test_download[True-0] - ValueError: earthaccess can't yet guess the provider for cloud collections, we need to use one from earthaccess.list_cloud_providers()
FAILED tests/integration/test_api.py::test_download[True-selection1] - ValueError: earthaccess can't yet guess the provider for cloud collections, we need to use one from earthaccess.list_cloud_providers()
FAILED tests/integration/test_auth.py::test_auth_can_read_from_netrc_file - AssertionError: False is not true
FAILED tests/integration/test_auth.py::test_auth_populates_attrs - AssertionError: False is not true
FAILED tests/integration/test_auth.py::test_auth_can_fetch_s3_credentials - AssertionError: False is not true
FAILED tests/integration/test_auth.py::test_get_s3_credentials_lowercase_location[location0] - assert {}
FAILED tests/integration/test_auth.py::test_get_s3_credentials_lowercase_location[location1] - assert {}
FAILED tests/integration/test_auth.py::test_get_s3fs_session_lowercase_location[location0] - requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
FAILED tests/integration/test_auth.py::test_get_s3fs_session_lowercase_location[location1] - requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
FAILED tests/integration/test_cloud_download.py::test_earthaccess_can_download_cloud_collection_granules[daac0] - AssertionError: assert False
FAILED tests/integration/test_cloud_download.py::test_earthaccess_can_download_cloud_collection_granules[daac1] - AssertionError: assert False
FAILED tests/integration/test_cloud_download.py::test_earthaccess_can_download_cloud_collection_granules[daac2] - AssertionError: assert False
FAILED tests/integration/test_cloud_download.py::test_earthaccess_can_download_cloud_collection_granules[daac3] - AssertionError: assert False
FAILED tests/integration/test_cloud_download.py::test_earthaccess_can_download_cloud_collection_granules[daac4] - AssertionError: assert False
FAILED tests/integration/test_cloud_download.py::test_multi_file_granule - requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
FAILED tests/integration/test_cloud_open.py::test_multi_file_granule - requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
FAILED tests/integration/test_onprem_download.py::test_earthaccess_can_download_onprem_collection_granules[daac0] - AssertionError: False is not true
FAILED tests/integration/test_onprem_download.py::test_earthaccess_can_download_onprem_collection_granules[daac1] - AssertionError: False is not true
============================================================================================= 18 failed, 52 passed, 1 skipped, 3 warnings in 174.40s (0:02:54) =============================================================================================
PR checklist:
- [X] Linted & Formatted
- [X] Updated unit & integration tests
- [X] Updated changelog & readme
- [X] Updated documentation
Pending:
- [ ] Pass currently failing integration tests.
📚 Documentation preview 📚: https://earthaccess--500.org.readthedocs.build/en/500/
It seems like the existing integration tests are failing due to credentials which I had set up as environment variables and in a .netrc file when running them. What is the best way to handle credentials in these tests? Also happy to move this to a draft if we want to work out those details first.
Hope you don't mind the rename, was finding the old name difficult to remember in my notifications :)
What is the best way to handle credentials in these tests?
@betolink @andypbarrett I think this may need documenting! As a developer, how do I run integration tests on my laptop?
@mfisher87 - I think that I was able to fix the integration test and also modified the services unit and integration tests to use VCR. All unit tests are passing.
I ran the integration tests locally and here is the summary:
==================== short test summary info =================================
FAILED tests/integration/test_auth.py::test_auth_can_read_from_netrc_file - AssertionError: False is not true FAILED tests/integration/test_auth.py::test_auth_populates_attrs - AssertionError: False is not true FAILED tests/integration/test_auth.py::test_auth_can_fetch_s3_credentials - AssertionError: False is not true FAILED tests/integration/test_auth.py::test_get_s3_credentials_lowercase_location[location0] - assert {} FAILED tests/integration/test_auth.py::test_get_s3_credentials_lowercase_location[location1] - assert {} FAILED tests/integration/test_auth.py::test_get_s3fs_session_lowercase_location[location0] - requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0) FAILED tests/integration/test_auth.py::test_get_s3fs_session_lowercase_location[location1] - requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
====== 7 failed, 71 passed, 1 skipped in 496.39s (0:08:16) ==========================================
I wonder if these tests need to be run in the same AWS region as the data?
Other than that I think this PR can be merged but let me know how that typically works for earthaccess development.
Will you be at hack day this coming week? I may not have time to review this before then :grimacing:
@mfisher87 - Unfortunately I am on travel this coming week so I can't make it but I do plan to attend the next earthaccess hack day in a few weeks. We can discuss then if it's easier! 😄
Sounds like a good plan :) Safe and fun travels!
@nikki-t I'll only attend 2nd half of hack day this week, had to schedule a dentist appointment at that time. I may be co-working on that call (I've used my funded Openscapes allocation), but feel free to interrupt me so we can have a quick chat. I love the how-to you added :star_struck:
@JessicaS11 - Thank you for the suggested updates, I have applied them!
@mfisher87 , @betolink - Do you think this PR is ready to be merged?
Hi @nikki-t the code looks good to me, there are some failing tests due a print statement which I'm not opposed to, @mfisher87do you know if there is a way to skip this rule on a file?
Hi @nikki-t the code looks good to me, there are some failing tests due a print statement which I'm not opposed to, @mfisher87do you know if there is a way to skip this rule on a file?
Please use logging, not printing. We addressed this with https://github.com/nsidc/earthaccess/issues/511.
Ahh right @chuckwondo cc @nikki-t
Suggested modifications have been made and unit and integration tests have been updated.
Unit Test Summary
================================================================================================================ short test summary info =================================================================================================================
FAILED tests/unit/test_collection_queries.py::test_query_can_parse_single_dates[2001-12-12-2001-12-21-2001-12-12T00:00:00Z,2001-12-21T23:59:59Z] - ValueError: time data '2001-12-12' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_collection_queries.py::test_query_can_parse_single_dates[2021-02-01--2021-02-01T00:00:00Z,] - ValueError: time data '2021-02-01' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_collection_queries.py::test_query_can_parse_single_dates[1999-02-01 06:00-2009-01-01-1999-02-01T06:00:00Z,2009-01-01T23:59:59Z] - ValueError: time data '1999-02-01 06:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_collection_queries.py::test_query_can_parse_single_dates[2019-03-10T00:00:00Z-2019-03-10T00:00:00-01:00-2019-03-10T00:00:00Z,2019-03-10T01:00:00Z] - ValueError: time data '2019-03-10T00:00:00-01:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_granule_queries.py::test_query_can_parse_single_dates[2001-12-12-2001-12-21-2001-12-12T00:00:00Z,2001-12-21T23:59:59Z] - ValueError: time data '2001-12-12' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_granule_queries.py::test_query_can_parse_single_dates[2021-02-01--2021-02-01T00:00:00Z,] - ValueError: time data '2021-02-01' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_granule_queries.py::test_query_can_parse_single_dates[1999-02-01 06:00-2009-01-01-1999-02-01T06:00:00Z,2009-01-01T23:59:59Z] - ValueError: time data '1999-02-01 06:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_granule_queries.py::test_query_can_parse_single_dates[2019-03-10T00:00:00Z-2019-03-10T00:00:00-01:00-2019-03-10T00:00:00Z,2019-03-10T01:00:00Z] - ValueError: time data '2019-03-10T00:00:00-01:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_results.py::TestResults::test_data_links - ValueError: time data '2020' does not match format '%Y-%m-%dT%H:%M:%SZ'
============================================================================================================== 9 failed, 31 passed in 2.56s ==============================================================================================================
Integration Test Summary
================================================================================================================ short test summary info =================================================================================================================
FAILED tests/integration/test_auth.py::test_auth_can_read_earthdata_env_variables - AttributeError: 'Auth' object has no attribute 'username'
FAILED tests/integration/test_auth.py::test_auth_can_read_from_netrc_file - AssertionError: False is not true
FAILED tests/integration/test_auth.py::test_auth_populates_attrs - AssertionError: False is not true
FAILED tests/integration/test_auth.py::test_auth_can_fetch_s3_credentials - AssertionError: False is not true
FAILED tests/integration/test_auth.py::test_get_s3_credentials_lowercase_location[location0] - assert {}
FAILED tests/integration/test_auth.py::test_get_s3_credentials_lowercase_location[location1] - assert {}
FAILED tests/integration/test_onprem_download.py::test_earthaccess_can_download_onprem_collection_granules[daac1] - AssertionError: False is not true
FAILED tests/integration/test_onprem_download.py::test_earthaccess_can_download_onprem_collection_granules[daac3] - AssertionError: 0 not greater than 3
FAILED tests/integration/test_onprem_open.py::test_earthaccess_can_open_onprem_collection_granules[daac3] - AssertionError: 0 not greater than 2
FAILED tests/unit/test_collection_queries.py::test_query_can_parse_single_dates[2001-12-12-2001-12-21-2001-12-12T00:00:00Z,2001-12-21T23:59:59Z] - ValueError: time data '2001-12-12' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_collection_queries.py::test_query_can_parse_single_dates[2021-02-01--2021-02-01T00:00:00Z,] - ValueError: time data '2021-02-01' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_collection_queries.py::test_query_can_parse_single_dates[1999-02-01 06:00-2009-01-01-1999-02-01T06:00:00Z,2009-01-01T23:59:59Z] - ValueError: time data '1999-02-01 06:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_collection_queries.py::test_query_can_parse_single_dates[2019-03-10T00:00:00Z-2019-03-10T00:00:00-01:00-2019-03-10T00:00:00Z,2019-03-10T01:00:00Z] - ValueError: time data '2019-03-10T00:00:00-01:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_granule_queries.py::test_query_can_parse_single_dates[2001-12-12-2001-12-21-2001-12-12T00:00:00Z,2001-12-21T23:59:59Z] - ValueError: time data '2001-12-12' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_granule_queries.py::test_query_can_parse_single_dates[2021-02-01--2021-02-01T00:00:00Z,] - ValueError: time data '2021-02-01' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_granule_queries.py::test_query_can_parse_single_dates[1999-02-01 06:00-2009-01-01-1999-02-01T06:00:00Z,2009-01-01T23:59:59Z] - ValueError: time data '1999-02-01 06:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_granule_queries.py::test_query_can_parse_single_dates[2019-03-10T00:00:00Z-2019-03-10T00:00:00-01:00-2019-03-10T00:00:00Z,2019-03-10T01:00:00Z] - ValueError: time data '2019-03-10T00:00:00-01:00' does not match format '%Y-%m-%dT%H:%M:%SZ'
FAILED tests/unit/test_results.py::TestResults::test_data_links - ValueError: time data '2020' does not match format '%Y-%m-%dT%H:%M:%SZ'
============================================================================================ 18 failed, 66 passed, 1 skipped, 1 warning in 596.56s (0:09:56) =============================================================================================
It looks like a lot of the tests are failing due to the date format. I pulled in the most recent changes from main to see if they had been fixed but it doesn't look like. I also typically see some test failures when running the integration tests locally.
@mfisher87 or @chuckwondo - Do you mind approving the workflows so that they can run? If these pass I think we might be in good shape to merge the PR.
Workflows kicked off! You should not have to worry about that... I sent you an invitation with "triage" rights (includes ability to add labels, close issues, etc., but not quite as much as "maintainer") you can accept or decline :)
If this isn't merged before next hack day, let's do it then? How are you feeling @chuckwondo ?