intake-stac
Add a `StacCollection.search` method
Purely as a convenience, it'd be nice to have a StacCollection.search method that uses pystac-client to search an endpoint with a specific collection.
cat = intake.open_stac_catalog("/path/to/catalog")
collection = cat["my-collection"]
collection.search(bbox=bbox)
The `.search` method would use pystac-client:
- Find the link with a `"rel": "search"`. Set that as the endpoint.
- Specify `collections=[self.id]`, to limit the search to just that collection.
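The two steps above can be sketched roughly as follows. This is only an illustration, not the intake-stac implementation: `find_search_href`, `build_search_kwargs`, and `search_collection` are hypothetical names, links are assumed to be plain dicts as they appear in STAC JSON, and the sketch assumes `pystac_client.ItemSearch` accepts the search URL as its first argument followed by keyword filters.

```python
from typing import Any, Dict, Iterable, Optional


def find_search_href(links: Iterable[Dict[str, Any]]) -> Optional[str]:
    """Return the href of the first link with rel == "search", or None."""
    for link in links:
        if link.get("rel") == "search":
            return link.get("href")
    return None


def build_search_kwargs(collection_id: str, **kwargs: Any) -> Dict[str, Any]:
    """Pass user filters through, but always pin the search to one collection."""
    params = dict(kwargs)
    params["collections"] = [collection_id]
    return params


def search_collection(links, collection_id, **kwargs):
    """Hypothetical helper: search the API endpoint advertised in `links`."""
    href = find_search_href(links)
    if href is None:
        raise ValueError("no link with rel='search'; is this a static catalog?")
    # Imported lazily so the helpers above work without pystac-client installed.
    from pystac_client import ItemSearch  # assumed signature: ItemSearch(url, **filters)

    return ItemSearch(href, **build_search_kwargs(collection_id, **kwargs))
```

A `StacCollection.search(**kwargs)` method could then delegate to something like `search_collection(links, self.id, **kwargs)`, where `links` comes from wherever the catalog advertises its search endpoint (typically the API root).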
I see now that intake's base Catalog class apparently defines a search method, which appears to do some kind of text-based search on the items. I suspect that most STAC users would expect search to behave like STAC search.
This sounds great @TomAugspurger! I personally don't see any good reason to avoid overriding the base class search method but we should ask @martindurant for his thoughts.
Please do make specialised versions of search(); the one in Catalog is super-simplistic and only meant to be a fallback when nothing better is available.
> The .search method would use pystac-client
@TomAugspurger I really like this idea, but it would need some docs / error handling for cases where a `"rel": "search"` link doesn't exist.
I'm thinking of the case of a static catalog/collection, where there is no API endpoint. For that case we could:
- Stick with the default intake.search(), which I think just does some string pattern matching.
- Or implement a simple 'local api' search with the same keywords to filter by bbox and datetime (e.g. geopandas operations on the static catalog represented as a GeoDataFrame https://github.com/intake/intake-stac/issues/36). @matthewhanson probably has some ideas on performance here, and things would likely be quite slow if someone tries this on a really big catalog.
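To make the second option concrete, here is a minimal sketch of a local bbox and datetime filter. It deliberately swaps the GeoDataFrame approach for plain Python over item dicts, so it is a stand-in for the idea rather than the geopandas version discussed above. `local_search`, `bboxes_intersect`, and `parse_datetime` are hypothetical names; items are assumed to carry a 2D `bbox` and a `properties.datetime` string, and boxes crossing the antimeridian are not handled.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple


def bboxes_intersect(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if two [west, south, east, north] boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def parse_datetime(value: str) -> datetime:
    """Parse an RFC 3339 timestamp such as '2021-06-01T00:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def local_search(
    items: Iterable[Dict[str, Any]],
    bbox: Optional[Sequence[float]] = None,
    datetime_range: Optional[Tuple[datetime, datetime]] = None,
) -> List[Dict[str, Any]]:
    """Linear scan over item dicts; every item is checked, so cost grows with catalog size."""
    results = []
    for item in items:
        if bbox is not None and not bboxes_intersect(item["bbox"], bbox):
            continue
        if datetime_range is not None:
            start, end = datetime_range  # both must be timezone-aware
            when = parse_datetime(item["properties"]["datetime"])
            if not (start <= when <= end):
                continue
        results.append(item)
    return results
```

Because this visits every item, it illustrates the performance concern as well: on a very large static catalog a spatial index (or the GeoDataFrame route) would likely be needed.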
I do rather like the idea of being able to do a search on a static catalog, but that seems like it should be implemented in pystac-client as well (which is named as such as it's a client for both static catalogs and APIs).
Currently pystac-client will raise an `APIError` if there is no `rel=search` link.
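One way a wrapper could avoid surfacing that error to users is to check for the search link up front and pick a strategy accordingly. The sketch below is an assumption about how such dispatch might look, not existing intake-stac or pystac-client behaviour: `has_search_link` and `search_with_fallback` are hypothetical names, and the two search callables are injected by the caller.

```python
from typing import Any, Callable, Dict, Iterable


def has_search_link(links: Iterable[Dict[str, Any]]) -> bool:
    """True if any link dict advertises rel == "search"."""
    return any(link.get("rel") == "search" for link in links)


def search_with_fallback(
    links: Iterable[Dict[str, Any]],
    api_search: Callable[..., Any],
    local_search: Callable[..., Any],
    **kwargs: Any,
) -> Any:
    """Use the API search when an endpoint is advertised, else a local fallback."""
    if has_search_link(links):
        return api_search(**kwargs)
    # Static catalog: no endpoint to query, so defer to the local strategy.
    return local_search(**kwargs)
```

Checking the links first keeps the behaviour explicit; catching the exception after the fact would also work but ties the wrapper to pystac-client's error types.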
I have done a bit of work putting together an implementation of pystac-client style search for static catalogs. This work lives in https://github.com/jsignell/stac-static, and it could be possible to delegate search to that library. The main limitation is that it depends on having a GeoDataFrame version of the STAC metadata, though.
> The main limitation is that it depends on having a geodataframe version of the stac metadata
I am not in a good position to reckon how much of a limitation that is.