support STAC specification
Implement the STAC API specification to support search/discovery of geospatial assets. Notes for implementation based on initial discussions with @matthewhanson :
- add front end routes (/stac/search)
- update pygeoapi.api to address request handling
- implement STAC provider backend via plugin mechanism which will interrogate backend and return results as a Python dictionary for marshalling to JSON proper to the client
- add stubs for transactional capability
- populating a STAC backend could be via workflow beyond pygeoapi (i.e. implement a CLI within the backend, which is not hooked into pygeoapi tooling proper but can be run offline just the same)
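The plugin idea in the notes above can be sketched roughly as follows. This is only an illustration, not pygeoapi's actual provider API: `StacProvider`, `InMemoryStacProvider`, `handle_search` and their parameters are hypothetical names, and the backend is a plain in-memory list standing in for a real store.

```python
import json
from abc import ABC, abstractmethod


class StacProvider(ABC):
    """Hypothetical base class for STAC backends (not a pygeoapi API)."""

    @abstractmethod
    def query(self, bbox=None, limit=10):
        """Return matching items as a FeatureCollection-style dict."""


class InMemoryStacProvider(StacProvider):
    """Toy backend that filters a list of item dicts held in memory."""

    def __init__(self, items):
        self.items = items

    def query(self, bbox=None, limit=10):
        matches = []
        for item in self.items:
            if bbox is None or _bboxes_intersect(item['bbox'], bbox):
                matches.append(item)
        return {'type': 'FeatureCollection', 'features': matches[:limit]}


def _bboxes_intersect(a, b):
    """True if two [minx, miny, maxx, maxy] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def handle_search(provider, bbox=None, limit=10):
    """Stand-in for the API layer: query the backend, marshal the dict to JSON."""
    return json.dumps(provider.query(bbox=bbox, limit=limit))
```

The point of the split is that the backend only ever deals in Python dicts, while JSON encoding stays in one place in the API layer.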
I'm curious to see how stac, ogcapi-coverage and ogcapi-records operate together on a single endpoint, what aspects they can share and where the challenges are. This would be good input for the upcoming sprint.
Are you considering external libraries for marshalling? Or do we need to implement our own?
Good point. I'm guessing a STAC backend could be provided via one of the sat-utils tools (for example), and a STAC backend's mission would be to provide Python dicts of JSON objects back to pygeoapi.api, but this remains to be seen/needs to be further tested.
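For the marshalling question, a minimal sketch using only the Python standard library: `json.dumps` handles plain dicts directly, and a `default` hook can cover values such as datetimes that a backend might return. The helper name `to_json` and the sample item are assumptions for illustration.

```python
import json
from datetime import date, datetime


def to_json(obj):
    """Serialize a backend result dict, converting dates to ISO 8601 strings."""

    def _default(value):
        if isinstance(value, (date, datetime)):
            return value.isoformat()
        raise TypeError(f'Cannot serialize {type(value).__name__}')

    return json.dumps(obj, default=_default)


item = {
    'type': 'Feature',
    'id': 'example-item',  # hypothetical identifier
    'properties': {'datetime': datetime(2019, 11, 6, 12, 0, 0)},
}
print(to_json(item))
```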
Ok that makes sense, thanks @tomkralidis.
@tomkralidis Should stac/search endpoint be optional in the configuration? I would say yes...
Would this depend on how we describe it in the configuration? For example, is STAC a dataset in the config? Other options?
Any guidance from the stac team? Is stac intended to run alongside OGC APIs in a single ogc-api endpoint, or does it require its own endpoint? In that case, maybe deploy a second instance of pygeoapi in a 'stac' mode?
@pvgenuchten I would consider its own endpoint as suggested in the bullet above from @tomkralidis (cc @matthewhanson)
Configuration could be something like:
catalogs:
  sat-api:
    provider:
      name: STAC
      data: https://sat-api-dev.developmentseed.org/stac
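One way the "optional in the configuration" idea could work with a config shaped like the one above is sketched here. The functions `stac_catalogs` and `optional_routes` are hypothetical, and the config is shown as an already-parsed Python dict rather than YAML.

```python
def stac_catalogs(config):
    """Return the catalog entries whose provider is named 'STAC'."""
    catalogs = config.get('catalogs') or {}
    return {
        name: entry
        for name, entry in catalogs.items()
        if (entry.get('provider') or {}).get('name') == 'STAC'
    }


def optional_routes(config):
    """Expose the search route only when a STAC catalog is configured."""
    return ['/stac/search'] if stac_catalogs(config) else []


# Parsed equivalent of the YAML snippet from the thread.
config = {
    'catalogs': {
        'sat-api': {
            'provider': {
                'name': 'STAC',
                'data': 'https://sat-api-dev.developmentseed.org/stac',
            }
        }
    }
}
print(optional_routes(config))
```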
@francbartoli is the thought that STAC catalog providers would be their own provider architecture (i.e. separate from data providers), or that STAC would be a quality of existing data providers? If an Elasticsearch backend, for instance, was loaded with STAC Items (perhaps marked in the dataset configuration), then some STAC-specific capabilities could be enabled.
To comment on the above comment:
Is stac intended to run alongside OGC APIs in a single ogc-api endpoint, or does it require its own endpoint? In that case, maybe deploy a second instance of pygeoapi in a 'stac' mode?
My understanding (which is a bit weaker, since I mostly work with static STACs) is that STAC API contains some additional endpoints:
- /stac - Simply gets the root catalog.
- /stac/search - Implemented so that STAC can do more advanced queries via extensions than what OAF currently supports
The idea would be that eventually, with the convergence of the Query/Filter extensions into OAF, the second endpoint would go away.
@matthewhanson could provide more info as I'm basically summarizing what I heard from him yesterday at the STAC sprint.
There is currently a PR up to change those endpoints: https://github.com/radiantearth/stac-spec/pull/632
The /stac endpoint would go away because it's redundant with the root endpoint / - it just returns a STAC catalog, which is the same thing that the root OAF endpoint returns with some additional fields.
The /stac/search endpoint is proposed to be renamed to /items and proposed to OAF as a general cross-collection search endpoint. However, this wouldn't go in until OAF 1.1.
Thanks @matthewhanson, so in the meantime we could adopt /items, but for users that might be a bit confusing if it is not yet part of the OAF spec. And we don't know when it will land there.
Right, not sure when it will land, but now it's agreed it's going to be /search not /items
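Since the search path moved from /stac/search to /items and then to /search within this thread, one option is to keep the path as a single parameter so a rename touches one place. The sketch below is purely illustrative: `build_routes` and `dispatch` are hypothetical helpers, not pygeoapi functions, and the handlers return placeholder dicts.

```python
def build_routes(search_path='/search'):
    """Map URL paths to handlers; the search path is configurable."""

    def root(params):
        # The root response acts as the catalog and links to the search path.
        return {'links': [{'rel': 'search', 'href': search_path}]}

    def search(params):
        # Placeholder handler: a real one would query a backend.
        return {'type': 'FeatureCollection', 'features': []}

    return {'/': root, search_path: search}


def dispatch(routes, path, params=None):
    """Call the handler registered for path, or return a not-found dict."""
    handler = routes.get(path)
    if handler is None:
        return {'code': 'NotFound', 'description': f'No route for {path}'}
    return handler(params or {})
```

With `build_routes('/stac/search')` the older path is served instead, without changing either handler.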
@lossyrob do you mean something like this below (looking at earth-search)?
datasets:
  cbers4-awfi:
    title: CBERS 4 AWFI Imagery
    description: CBERS 4 AWFI Imagery
    keywords:
      - stac
      - stac-api
      - assets
    links:
      - type: application/json
        rel: collection
        title: information
        href: https://earth-search.aws.element84.com/collections/cbers4-awfi
        hreflang: en-US
    extents:
      spatial:
        bbox: [-180,-90,180,90]
        crs: http://www.opengis.net/def/crs/OGC/1.3/CRS84
      temporal:
        begin: null
        end: null  # or empty (either means open ended)
    provider:
      name: STAC
      data:  # borrow data architecture from OGR provider
        source_type: ES
        source: ES:http://localhost:9200/cbers4-awfi
With this we are implicitly saying that cbers4-awfi is a collection, but at some point we lose the knowledge that it is specifically a STAC one, at least from an OAPIF perspective.
On the other hand, we could have a dedicated architecture like:
catalogues:
  hello-catalogue:
    type: OAPIC (CAT4)???
  sat-api:
    type: STAC
    provider:
      name: STAC
    datasets:
      cbers4-awfi:
        title: CBERS 4 AWFI Imagery
        description: CBERS 4 AWFI Imagery
        keywords:
          - stac
          - stac-api
          - assets
        links:
          - type: application/json
            rel: collection
            title: information
            href: https://earth-search.aws.element84.com/collections/cbers4-awfi
            hreflang: en-US
        extents:
          spatial:
            bbox: [-180,-90,180,90]
            crs: http://www.opengis.net/def/crs/OGC/1.3/CRS84
          temporal:
            begin: null
            end: null  # or empty (either means open ended)
        provider:
          name: STAC
          data:  # borrow data architecture from OGR provider
            source_type: ES
            source: ES:http://localhost:9200/cbers4-awfi
Here the concept of collection is nested in the specific provider type. Other options @tomkralidis @pvgenuchten ?
Perhaps /search as the cross collection search reuses the provider plugin approach and is specified like:
catalogues:
  landsat8-aws:
    type: STAC
    title: Landsat 8 AWS catalog
    description: Landsat 8 AWS catalog
    keywords:
      - landsat
    links:
      - type: text/html
        rel: canonical
        title: information
        href: https://registry.opendata.aws/landsat-8/
        hreflang: en-US
    extents:
      spatial:
        bbox: [-180,-90,180,90]
        crs: http://www.opengis.net/def/crs/OGC/1.3/CRS84
      temporal:
        begin: 2013-03-18
        end: null  # or empty (either means open ended)
    provider:
      name: Elasticsearch
      data: http://localhost:9200/landsat-aws/FeatureCollection
      id_field: ID
and then /search can be routed to reuse pygeoapi.get_collection_items. In the /search case, collections is a query parameter. So we can either consider searching every endpoint in catalogues in the config, or having a single catalogue with a required collection property that can be queried against. The former would be tricky as to how to return multi-collection results in a single FeatureCollection.
Thoughts?
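To make the multi-collection concern concrete, here is one possible sketch of merging results from several catalogues into a single FeatureCollection by tagging each feature with the catalogue it came from. Everything here is an assumption for illustration: `search_catalogues` is a hypothetical function, and each catalogue is modelled as a callable returning a list of feature dicts rather than a real provider plugin.

```python
def search_catalogues(catalogues, collections=None, limit=10):
    """Query the selected catalogues and merge results into one FeatureCollection.

    catalogues: dict mapping a catalogue name to a zero-argument callable
        that returns a list of feature dicts (stand-in for a provider).
    collections: optional list of catalogue names; None searches all.
    """
    selected = collections if collections is not None else list(catalogues)
    features = []
    for name in selected:
        if name not in catalogues:
            continue  # unknown names are skipped in this sketch
        for feature in catalogues[name]():
            # Copy so backend data is not mutated, then record the origin.
            features.append(dict(feature, collection=name))
    return {
        'type': 'FeatureCollection',
        'numberMatched': len(features),
        'numberReturned': min(len(features), limit),
        'features': features[:limit],
    }
```

This fetches everything before applying the limit, which is fine for a sketch but would need per-backend paging in practice.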
Considering https://github.com/radiantearth/stac-spec/pull/632#issuecomment-550350731, I imagine this method will search/browse through a server in a Google type of way: a list of 3 datasets, 5 catalog records and 2 grids. I like it. From the current discussion I get the feeling that the stac team actually wants to see stac made available embedded in a/the OAPI endpoint (and not separately).
Sorry for my unfamiliarity with stac: am I understanding correctly that stac exposes a queryable series of metadata records of sensor observations (imagery) at a given time/location? A client will then be able to extract the relevant fraction of a cloud-optimised GeoTIFF (or alternative source)? To me these cases seem quite similar to what others are designing in OAPI-records, SensorThings and/or OAPI-coverage, so they are either very likely to collide (separate endpoint +1) or, on the other hand, this could be an opportunity to engage with those teams and design a shared model (embedded +1).
Looking forward to hearing your thoughts/ideas.
WIP in https://github.com/geopython/pygeoapi/tree/stac . Notes:
- code basically re-uses /collections/items logic along with a filter JSON payload (currently does nothing), and detects /search in order to query catalogues objects / backends in config
- the concept of a default or cross collection search still to be determined. Specifying collections works, albeit against a single collection atm. If we have 1..n catalogues objects defined in pygeoapi, how would a cross collection search work? If we assume, for example, that all catalogues are backed by something like ES, then one can do cross index searching. Else, we could define a single catalogue in a pygeoapi instance in which all documents to be searched are in that single index, which would work, but not very pragmatic
Note the STAC example here is based on Landsat 8 AWS (tooling hacked together at https://gist.github.com/tomkralidis/3b6263ec9fbd84e6b50d79527dda149f to set up a basic ES index).
In geonetwork we deploy a specific instance of elastic search for this use case; metadata records, as well as content from WFS's is indexed in that instance to facilitate cross CSW/WFS search. An administrator indicates which WFS's to crawl.
This approach could also be relevant for pygeoapi. In the case of csv/shapefiles, pygeoapi could operate against the index for many operations, which would benefit performance.
If an index like Elasticsearch were to become such an essential component, it would be good to provide an abstraction layer, so a user could select their favourite index (or database) to provide such functionality (SOLR, Noise, PostGIS).
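The abstraction layer suggested above might look something like the following sketch. The interface (`SearchIndex`), the toy in-memory implementation, and the `INDEX_BACKENDS` registry are all hypothetical; a real deployment would register Elasticsearch, SOLR or PostGIS implementations behind the same interface.

```python
from abc import ABC, abstractmethod


class SearchIndex(ABC):
    """Hypothetical interface every index backend would implement."""

    @abstractmethod
    def add(self, doc_id, document):
        """Store a document dict under the given identifier."""

    @abstractmethod
    def search(self, term):
        """Return identifiers of documents matching the term."""


class InMemoryIndex(SearchIndex):
    """Toy backend: case-insensitive substring match on the title."""

    def __init__(self):
        self._docs = {}

    def add(self, doc_id, document):
        self._docs[doc_id] = document

    def search(self, term):
        term = term.lower()
        return [
            doc_id
            for doc_id, doc in self._docs.items()
            if term in str(doc.get('title', '')).lower()
        ]


# Registry keyed by the name a user would put in the configuration.
INDEX_BACKENDS = {'memory': InMemoryIndex}


def load_index(name):
    """Instantiate the configured backend, failing clearly on unknown names."""
    try:
        return INDEX_BACKENDS[name]()
    except KeyError:
        raise ValueError(f'Unknown index backend: {name}') from None
```

Callers depend only on `add` and `search`, so swapping the backend becomes a configuration change rather than a code change.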
@francbartoli I'm a bit unclear what the best path is on the configuration side, but I think that's due to my lack of familiarity with pygeoapi. Tom's WIP branch looks like it's on the right track though!
Update: current work in https://github.com/geopython/pygeoapi/tree/stac
FYI functionality merged in #389. Keeping open for STAC API implementation.
@tomkralidis
Any news on the implementation of the /search endpoint?
the stac branch doesn't seem to exist anymore, but apparently there was some WIP toward adding this functionality.
@ricardogsilva in the stac branch there was a basic Elasticsearch provider which became dated. With OGC API - Records evolving, we decided to wait on implementing STAC API until it becomes clearer how OARec will relate to stac /search.
Hi All,
I hate to dig up an old post, but has any /search feature been added (i.e. like https://stacspec.org/STAC-api.html#operation/getSearchSTAC)? We have just set up pygeoapi, and it still does not seem to be available.
Thanks!
@gnosys-tmiller I have a pending branch/PR to implement STAC API, which should be completed in the next 2 weeks or so. cc @cholmes.
Hey @tomkralidis is your WIP allowing an existing STAC API to be browsed from within pygeoapi, or for pygeoapi itself to act as a STAC API?
Also, this is labeled "help wanted" - what can be done to help? :wink:
I too am interested in the intersection of STAC and pygeoapi. Any links to what support can be offered? Happy to dig in and help.
Any progress here @tomkralidis? Can we lend a hand getting this over the finish line?
As per RFC4, this Issue has been inactive for 90 days. In order to manage maintenance burden, it will be automatically closed in 7 days.
As per RFC4, this Issue has been closed due to there being no activity for more than 90 days.