pystac-client

Support for STAC API - Collection Search

Open hrodmn opened this issue 1 year ago • 7 comments

It can be difficult for a user to identify which collection they want to query from a STAC API before they begin searching for items. I have been thinking a lot about improving the ergonomics of collection discovery lately while working on a tool for federated collection discovery. Most of the code in that project is just a mechanism for crawling through the collections returned by the /collections endpoint and checking to see if they match the provided search criteria.

The STAC API - Collection Search extension is intended to provide an API endpoint for filtering collections based on some criteria. It is not implemented widely yet but it enriches the collection discovery process significantly when paired with a client application like this STAC Browser example.

What needs to happen to add a collection_search method to pystac_client.Client?

hrodmn avatar Aug 05 '24 14:08 hrodmn

While you're correct that there is an extension for collection search, it's a bit out of date (e.g. it references v1.0.0-rc.1 of the STAC API spec, is itself at v1.0.0-rc.1, and is at pilot maturity). I see from https://github.com/stac-api-extensions/collection-search/commit/4ad94f2b73b8a240d32328574367f3b2073fcd05 that there are two implementations, which helps: if they're public, they would give us APIs to write tests against.

I think the TODOs to include collection search in pystac-client would be:

  • [ ] Identify one or more public API instances to write tests against
  • [ ] Create a Client.collection_search that would return a CollectionSearch instance, the same way that Client.search returns an ItemSearch
  • [ ] Build out CollectionSearch, which would likely share a lot of behavior w/ ItemSearch
  • [ ] (optional but desirable) Release v1.0.0 of the collection search extension and get a public API to update to new release
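
To make the second and third TODOs concrete, here is a rough sketch of what a CollectionSearch could look like if it mirrored ItemSearch. Everything in it is hypothetical: the class, its method names, and the `fetch` callable are illustrative and not part of pystac-client. The query parameters (`bbox`, `datetime`, `q`, `limit`) are the ones described by the collection search extension, and the sketch assumes the API paginates with `rel="next"` links whose href already carries the query.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport hook: takes (url, params) and returns the parsed JSON body.
Fetch = Callable[[str, Dict[str, Any]], Dict[str, Any]]


@dataclass
class CollectionSearch:
    """Hypothetical counterpart to ItemSearch; not part of pystac-client."""

    url: str
    fetch: Fetch
    bbox: Optional[List[float]] = None
    datetime: Optional[str] = None
    q: Optional[str] = None
    limit: Optional[int] = None

    def get_parameters(self) -> Dict[str, Any]:
        """Build GET query parameters, omitting anything left unset."""
        params: Dict[str, Any] = {}
        if self.bbox is not None:
            params["bbox"] = ",".join(str(v) for v in self.bbox)
        if self.datetime is not None:
            params["datetime"] = self.datetime
        if self.q is not None:
            params["q"] = self.q
        if self.limit is not None:
            params["limit"] = self.limit
        return params

    def collections_as_dicts(self) -> Iterator[Dict[str, Any]]:
        """Yield collections page by page, following rel="next" links."""
        url: Optional[str] = self.url
        params = self.get_parameters()
        while url is not None:
            page = self.fetch(url, params)
            yield from page.get("collections", [])
            url = next(
                (
                    link["href"]
                    for link in page.get("links", [])
                    if link.get("rel") == "next"
                ),
                None,
            )
            # Assumption: the next link already encodes the original query.
            params = {}
```

Passing the transport in as a callable keeps the sketch testable against canned responses; a Client.collection_search would presumably hand over its own I/O layer instead, the way Client.search does for ItemSearch.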

@m-mohr you've touched collection search stuff more than I have, any additional thoughts?

gadomski avatar Aug 05 '24 15:08 gadomski

The only public implementation that I am aware of is https://emc.spacebel.be/

There is already a collection_search function in pgstac, and I am working on https://github.com/stac-utils/stac-fastapi-pgstac/pull/136. Once that's stable I intend to deploy it to some public APIs that Development Seed maintains.

hrodmn avatar Aug 05 '24 15:08 hrodmn

Collection Search is here to stay: fastapi has an implementation, and so does STAC Browser.

Good point that rc.1 of the API is referenced. Please open an issue for it (have to run). Thanks.

m-mohr avatar Aug 05 '24 15:08 m-mohr

@m-mohr issue: https://github.com/stac-api-extensions/collection-search/issues/16

gadomski avatar Aug 05 '24 16:08 gadomski

:wave: @gadomski - I would like to get started on this feature sometime soon!

Since most STAC APIs will not have the collection-search extension enabled until it is fully implemented in the common STAC API frameworks and the updates are deployed, what would you think about adding collection filtering capability to pystac-client in the meantime?

I hacked together a system for filtering results from the /collections endpoint in the federated collection discovery repo. It is not pretty but it makes it possible to perform a collection search by iterating through the pages returned by /collections and keeping collections that overlap with the search terms.
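
For illustration, the client-side approach could look roughly like the sketch below. It is not the code from the federated collection discovery repo. The function names are made up, and it assumes each collection is a plain dict whose first `extent.spatial.bbox` and `extent.temporal.interval` entries describe the overall extent, that bounding boxes are 2D and do not cross the antimeridian, and that timestamps carry a timezone (e.g. a trailing `Z`).

```python
from datetime import datetime
from typing import Any, Dict, Iterable, Iterator, Optional, Sequence


def bboxes_intersect(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if two 2D [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def _parse(value: Optional[str]) -> Optional[datetime]:
    """Parse an RFC 3339 timestamp; None stands for an open-ended bound."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def intervals_overlap(
    a: Sequence[Optional[str]], b: Sequence[Optional[str]]
) -> bool:
    """True if two [start, end] intervals overlap; None means unbounded."""
    a_start, a_end = _parse(a[0]), _parse(a[1])
    b_start, b_end = _parse(b[0]), _parse(b[1])
    a_starts_in_time = a_start is None or b_end is None or a_start <= b_end
    b_starts_in_time = b_start is None or a_end is None or b_start <= a_end
    return a_starts_in_time and b_starts_in_time


def filter_collections(
    collections: Iterable[Dict[str, Any]],
    bbox: Optional[Sequence[float]] = None,
    interval: Optional[Sequence[Optional[str]]] = None,
) -> Iterator[Dict[str, Any]]:
    """Keep collections whose overall extent overlaps the search terms."""
    for collection in collections:
        extent = collection.get("extent", {})
        if bbox is not None:
            boxes = extent.get("spatial", {}).get("bbox", [])
            if not boxes or not bboxes_intersect(boxes[0], bbox):
                continue
        if interval is not None:
            intervals = extent.get("temporal", {}).get("interval", [])
            if not intervals or not intervals_overlap(intervals[0], interval):
                continue
        yield collection
```

The `collections` argument can be any iterable, so the pages returned by /collections can be chained into it lazily rather than loaded all at once.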

hrodmn avatar Sep 13 '24 19:09 hrodmn

> Since most STAC APIs will not have the collection-search extension enabled until it is fully implemented in the common STAC API frameworks and the updates are deployed, what would you think about adding collection filtering capability to pystac-client in the meantime?

I think it makes sense, maybe with a warning so the user knows that they're doing things "the hard way" (i.e. client-side).

gadomski avatar Sep 14 '24 09:09 gadomski

... also if it's paginated, users should be made aware that the result is probably incomplete...
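
A minimal sketch of how both warnings might be raised, using only the standard `warnings` module. The helper names are hypothetical and none of this is pystac-client API. It assumes the server advertises support through a conformance URI ending in `/collection-search` (the exact URI depends on the extension version) and signals further pages with a `rel="next"` link.

```python
import warnings
from typing import Any, Dict, List


def supports_collection_search(conforms_to: List[str]) -> bool:
    """Assumption: support is advertised by a URI ending in /collection-search."""
    return any(uri.rstrip("/").endswith("/collection-search") for uri in conforms_to)


def warn_if_client_side(conforms_to: List[str]) -> bool:
    """Warn and return True when filtering has to happen on the client."""
    if supports_collection_search(conforms_to):
        return False
    warnings.warn(
        "Server does not advertise collection search; "
        "filtering /collections results client-side.",
        UserWarning,
        stacklevel=2,
    )
    return True


def warn_if_incomplete(page: Dict[str, Any]) -> bool:
    """Warn and return True when a page has a next link that was not followed."""
    has_next = any(link.get("rel") == "next" for link in page.get("links", []))
    if has_next:
        warnings.warn(
            "The /collections response is paginated; "
            "results filtered from this page alone are probably incomplete.",
            UserWarning,
            stacklevel=2,
        )
    return has_next
```

A caller that walks every page would only need the first warning; the second matters when a single page is filtered on its own.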

m-mohr avatar Sep 14 '24 11:09 m-mohr