intake-stac

Add a `StacCollection.search` method

Open TomAugspurger opened this issue 4 years ago • 6 comments

Purely as a convenience, it'd be nice to have a StacCollection.search method that uses pystac-client to search an endpoint with a specific collection.

    import intake

    cat = intake.open_stac_catalog("/path/to/catalog")
    collection = cat["my-collection"]
    # Proposed API: search only within this collection
    collection.search(bbox=bbox)

The .search method would use pystac-client to:

  1. Find the link with "rel": "search" and set that as the endpoint.
  2. Specify collections=[self.id] to limit the search to just that collection.
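
A rough sketch of those two steps, for illustration only. The helper names (`find_search_href`, `collection_search_kwargs`, `search_collection`) are hypothetical and not part of intake-stac. It assumes that the collection's STAC dictionary carries the "rel": "search" link itself (on many APIs that link lives on the root catalog instead) and that pystac-client's `ItemSearch` accepts the search endpoint as `url`.

```python
from typing import Any, Dict, List, Optional


def find_search_href(links: List[Dict[str, Any]]) -> Optional[str]:
    """Step 1: return the href of the first link with rel == "search", or None."""
    for link in links:
        if link.get("rel") == "search":
            return link.get("href")
    return None


def collection_search_kwargs(collection_id: str, **kwargs: Any) -> Dict[str, Any]:
    """Step 2: merge user parameters with a filter pinned to one collection."""
    params = dict(kwargs)
    # Always restrict the search to this collection, overriding any user value.
    params["collections"] = [collection_id]
    return params


def search_collection(stac_dict: Dict[str, Any], **kwargs: Any):
    """Hypothetical body for StacCollection.search (not the actual intake-stac code)."""
    href = find_search_href(stac_dict.get("links", []))
    if href is None:
        raise ValueError('no link with "rel": "search" found')
    # Imported lazily so the helpers above work without pystac-client installed.
    from pystac_client import ItemSearch

    # Assumption: ItemSearch takes the search endpoint URL as `url`.
    return ItemSearch(url=href, **collection_search_kwargs(stac_dict["id"], **kwargs))
```

With something like this, `collection.search(bbox=bbox)` could simply forward its keyword arguments to `search_collection`, passing the collection's own STAC dictionary.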

I see now that intake's base Catalog class apparently defines a search method, which appears to do some kind of text-based search on the items. I suspect that most STAC users would expect search to behave like STAC search.

TomAugspurger avatar Jun 20 '21 15:06 TomAugspurger

This sounds great @TomAugspurger! I personally don't see any good reason to avoid overriding the base class search method but we should ask @martindurant for his thoughts.

jhamman avatar Jun 21 '21 16:06 jhamman

Please do make specialised versions of search(); the one in Catalog is super-simplistic and only meant to be a fallback when nothing better is available.

martindurant avatar Jun 21 '21 16:06 martindurant

> The .search method would use pystac-client

@TomAugspurger I really like this idea, but it would need some docs / error handling for cases where no "rel": "search" link exists.

I'm thinking of the case of a static catalog/collection, where there is no API endpoint. For that case we could:

  1. Stick with default intake.search() that I think just does some string pattern matching.

  2. Or implement a simple 'local api' search with the same keywords to filter by bbox and datetime (e.g. geopandas operations on the static catalog represented as a GeoDataFrame, see https://github.com/intake/intake-stac/issues/36). @matthewhanson probably has some ideas on performance here, and things would likely be quite slow if someone tries this on a really big catalog.
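
To make option 2 concrete, here is a minimal pure-Python sketch of a local bbox/datetime filter over STAC item dictionaries. It is not the geopandas approach mentioned above and not an existing intake-stac or pystac-client function; `local_search`, `bbox_intersects` and `parse_time` are hypothetical names. It assumes every item has a 2D `bbox` and a non-null `properties.datetime`, and it scans linearly, so it would indeed be slow on a very large catalog.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple


def bbox_intersects(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if two [west, south, east, north] boxes overlap (no antimeridian handling)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def parse_time(value: str) -> datetime:
    """Parse an RFC 3339 timestamp, accepting a trailing "Z" for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def local_search(
    items: Iterable[Dict[str, Any]],
    bbox: Optional[Sequence[float]] = None,
    datetime_range: Optional[Tuple[str, str]] = None,
) -> List[Dict[str, Any]]:
    """Keep the items that match every filter that was supplied."""
    bounds = None
    if datetime_range is not None:
        # Parse the (start, end) strings once, outside the loop.
        bounds = (parse_time(datetime_range[0]), parse_time(datetime_range[1]))

    results = []
    for item in items:
        if bbox is not None and not bbox_intersects(item["bbox"], bbox):
            continue
        if bounds is not None:
            when = parse_time(item["properties"]["datetime"])
            if not (bounds[0] <= when <= bounds[1]):
                continue
        results.append(item)
    return results
```

The keyword names here loosely mirror the STAC API search parameters, but `datetime_range` is a simplification: the real `datetime` parameter also accepts single instants and open-ended intervals.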

scottyhq avatar Jun 21 '21 17:06 scottyhq

I do rather like the idea of being able to do a search on a static catalog, but that seems like it should be implemented in pystac-client as well (which is named as such as it's a client for both static catalogs and APIs).

Currently, pystac-client will raise an APIError if there is no rel=search link.

matthewhanson avatar Jun 21 '21 19:06 matthewhanson

I have done a bit of work putting together an implementation of pystac-client style search for static catalogs. This work lives in https://github.com/jsignell/stac-static, and it could be possible to delegate search to that library. The main limitation is that it depends on having a GeoDataFrame version of the STAC metadata, though.

jsignell avatar Sep 21 '23 13:09 jsignell

> The main limitation is that it depends on having a geodataframe version of the stac metadata

I am not in a good position to reckon how much of a limitation that is.

martindurant avatar Sep 25 '23 15:09 martindurant