cmr-stac
`eo:cloud_cover` query parameter returns empty search
It seems that the `eo:cloud_cover` query parameter is not correctly filtering STAC Items in the CMR STAC API. If the query parameter is included, `pystac_client.Client.search()` returns 0 items, even though there are valid items in the catalog with this property:

    from pystac_client import Client
    from shapely.geometry import Point

    cmr_earthdata_api = 'https://cmr.earthdata.nasa.gov/stac/LPCLOUD'
    cmr_earthdata_client = Client.open(url=cmr_earthdata_api)

    # Search HLS Landsat items at a point, keeping only scenes under 20% cloud cover
    search_results = cmr_earthdata_client.search(
        collections=['HLSL30.v2.0'],
        datetime='2021-02-01/2021-03-01',
        intersects=Point(-73.97, 40.78),
        query=["eo:cloud_cover<20"]
    )
    print(len(search_results.get_all_items()))  # shows 0: no results returned from the API

If we modify the code snippet above slightly to comment out `query=["eo:cloud_cover<20"]`, the search returns 2 valid items, each of which has the appropriate `eo:cloud_cover` metadata property:

    ...
    search_results = cmr_earthdata_client.search(
        collections=['HLSL30.v2.0'],
        datetime='2021-02-01/2021-03-01',
        intersects=Point(-73.97, 40.78)
    )
    cmr_items = search_results.get_all_items()
    for item in cmr_items:
        print(item.id)
        print(item.properties)

Without the `eo:cloud_cover` query parameter, the search returns the following:

    HLS.L30.T18TWL.2021039T153324.v2.0
    {'datetime': '2021-02-08T15:33:24.028Z', 'start_datetime': '2021-02-08T15:33:24.028Z', 'end_datetime': '2021-02-08T15:33:47.911Z', 'eo:cloud_cover': 6}
    HLS.L30.T18TWL.2021055T153318.v2.0
    {'datetime': '2021-02-24T15:33:18.868Z', 'start_datetime': '2021-02-24T15:33:18.868Z', 'end_datetime': '2021-02-24T15:33:42.759Z', 'eo:cloud_cover': 97}

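While the server-side filter returns nothing, one possible stopgap is to run the unfiltered search and apply the cloud cover threshold client-side. The sketch below is only an illustration: `filter_by_cloud_cover` is a hypothetical helper, it works on plain `(id, properties)` pairs rather than pystac objects, and dropping items that lack the property is an assumption, not documented CMR or pystac-client behaviour.

```python
def filter_by_cloud_cover(items, max_cloud_cover):
    """Keep (item_id, properties) pairs whose eo:cloud_cover is below the threshold.

    Hypothetical helper for illustration. Items without an eo:cloud_cover
    property are dropped (an assumption; adjust if they should be kept).
    """
    kept = []
    for item_id, properties in items:
        cloud_cover = properties.get("eo:cloud_cover")
        if cloud_cover is not None and cloud_cover < max_cloud_cover:
            kept.append((item_id, properties))
    return kept


# The two items returned by the unfiltered search above (properties trimmed)
items = [
    ("HLS.L30.T18TWL.2021039T153324.v2.0", {"eo:cloud_cover": 6}),
    ("HLS.L30.T18TWL.2021055T153318.v2.0", {"eo:cloud_cover": 97}),
]

clear_items = filter_by_cloud_cover(items, 20)
print([item_id for item_id, _ in clear_items])
# prints ['HLS.L30.T18TWL.2021039T153324.v2.0']
```

With real search results, the same pairs could be built from `item.id` and `item.properties`. This downloads every matching item's metadata first, so it is no substitute for a working server-side filter on large searches.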
Here's a notebook that demonstrates the problem and shows that it does not occur when using the `eo:cloud_cover` query parameter with AWS Earth Search:
https://notebooksharing.space/view/7e63f879ff1bad1d8a838e568cdcd67f6a5f17b17a7394ab99dd8f531f89f5fa#displayOptions=
after discussion with @sharkinsspatial it looks like there's some tricky stuff going on
AWS Earth Search supports the query extension ~a non standard, out of spec way to filter `eo:cloud_cover` for some reason. `query=["eo:cloud_cover<20"]` shouldn't work for any stac catalog~

this is the STAC spec way to filter without the query extension:

    search = client.search(
        collections=[collection],
        intersects=point,
        # RFC 3339 datetimes need a "T" between the date and the time
        datetime='2020-03-20T00:00:00Z/2020-03-30T00:00:00Z',
        max_items=10,
        query={
            "eo:cloud_cover": {
                "lt": 20
            }
        }
    )
    len(search.get_all_items())

however this is still broken for CMR, noted in this issue
@jaybarra sorry to ping but is there a timeline for fixing #206 ? We are trying to show the public a modern way to access NASA data via the CMR STAC and would like to show a solution that allows them to filter by cloud cover and other eo: properties https://carpentries-incubator.github.io/geospatial-python/05-access-data/#solution-2
@rbavery Note that the alternate syntax for query isn't a feature of the Earth-Search API, but is actually a feature of pystac-client. See the docs here: https://pystac-client.readthedocs.io/en/latest/usage.html#query-extension
pystac-client converts the shortcut syntax into STAC Query JSON, so would work for any API that supports the Query extension.
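
To make the equivalence of the two syntaxes concrete, here is a rough sketch of the kind of conversion described above. It is not pystac-client's actual code: `shortcut_to_query` and `OPERATORS` are hypothetical names, and the operator table is an assumption covering only a few comparisons.

```python
import json

# Assumed operator table for illustration. Two-character symbols come first
# so that "<=" is not mistaken for "<".
OPERATORS = (("<=", "lte"), (">=", "gte"), ("<", "lt"), (">", "gt"), ("=", "eq"))


def shortcut_to_query(expressions):
    """Turn strings like "eo:cloud_cover<20" into a Query extension style dict."""
    query = {}
    for expression in expressions:
        for symbol, name in OPERATORS:
            if symbol in expression:
                prop, raw_value = expression.split(symbol, 1)
                try:
                    # Parse numbers and booleans; fall back to the raw string
                    value = json.loads(raw_value)
                except json.JSONDecodeError:
                    value = raw_value.strip()
                query.setdefault(prop.strip(), {})[name] = value
                break
        else:
            raise ValueError(f"no comparison operator found in {expression!r}")
    return query


print(shortcut_to_query(["eo:cloud_cover<20"]))
# prints {'eo:cloud_cover': {'lt': 20}}
```

Both forms therefore end up as the same JSON in the search request, so whether the filter works depends on the API's Query extension support rather than on which syntax the client code uses.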
(Comment from Alicia Aleman):
Query syntax is incorrect.