
sparql query as a source for a wfs3 collection

Open pvgenuchten opened this issue 5 years ago • 5 comments

This issue captures some of the Gitter discussion we had at the WFS3 hackathon in NL.

There are generally two use cases for WFS:

  1. organisations with an ambition to publish some data on the web as a webservice
  2. organisations having services online, but wanting to expand their audience by exposing to the geo community

In 1) the published concept is unique; in 2) the published feature is a representation of an object published elsewhere (it should use the external URI as its id?).

My expectation is that we'll see more of 2) in the future. We can then also think about having a wfs3 endpoint on the rdf sources, so it exposes the same content. This may be facilitated by having pygeoapi expose the result of a SPARQL query on a triple store as wfs3.

Maybe we can use something like https://rdflib.github.io/sparqlwrapper/ to retrieve the data?
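As a rough, hypothetical sketch of what such a provider might send to a triple store: the helper below builds a SPARQL SELECT for one feature URI and a configurable predicate-to-variable mapping. The function name and mapping shape are illustrative, not existing pygeoapi code; the resulting string could then be executed via SPARQLWrapper (`setQuery()` / `setReturnFormat(JSON)`).

```python
# Hypothetical sketch (not pygeoapi code): build the kind of SELECT query a
# SPARQL-backed provider might issue for a single feature URI.

def build_select(uri, predicates):
    """Return a SPARQL SELECT fetching the given predicates for one subject.

    `predicates` maps output variable names to predicate URIs, e.g.
    {"pop": "http://dbpedia.org/ontology/populationTotal"}.
    """
    vars_ = " ".join(f"?{name}" for name in predicates)
    # OPTIONAL keeps the row even when a predicate is missing for this subject
    patterns = " ".join(
        f"OPTIONAL {{ <{uri}> <{pred}> ?{name} . }}"
        for name, pred in predicates.items()
    )
    return f"SELECT {vars_} WHERE {{ {patterns} }}"

query = build_select(
    "http://dbpedia.org/resource/Berlin",
    {"pop": "http://dbpedia.org/ontology/populationTotal"},
)
print(query)
```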

pvgenuchten avatar Jun 05 '19 09:06 pvgenuchten

@ksonda and I have been discussing something very similar in the context of work on the geoconnex.us and geoconnex.ca feature repositories.

The use case we have is where we want to add rich JSON-LD to a feature landing page and provide JSON-LD and other semantic formats for the feature itself. In this case, pyGeoAPI is hosting features that are identified by persistent URIs -- we want to query a triple store / graph database for linked data related to the persistent URI and template that content into an HTML header as well as some stand-alone media types.

My fascination with PlantUML is maybe becoming problematic, but here's something visual to describe the concept.


@startuml

package "App-Server" {
  [pyGeoAPI]
  [Linked\nData]
}

[Triple Store] - [Linked\nData] : SPARQL

[Spatial Features] - [pyGeoAPI] : SQL

database "Triple Store" {
}

database "Spatial Features" {
}

[pyGeoAPI] --> HTTP

[Linked\nData] ... [pyGeoAPI] : URI

@enduml

dblodgett-usgs avatar Sep 24 '20 20:09 dblodgett-usgs

Not sure if this should be combined with #615 but the use case seems different enough.

~~Our idea is to use the capabilities introduced by #676 to use pygeoapi to publish lightweight geospatial collections that include URI attributes.~~ This isn't actually necessary

The URI could then be used by a SPARQL provider to pipe predicate-object pairs into the item-level pygeoapi responses as feature properties, with a configurable list of predicates and the variable names they would take in the feature properties.

There thus needs to be some way to specify a feature collection with both a spatial feature provider and a SPARQL provider. Whether it's possible to simply add a SPARQL provider to a given resource, or whether custom combination providers like "CSV-SPARQL", "SQLite-SPARQL", etc. would need to be written, we're willing to explore, even if they never get added as core plugins.

As an example, let's say we have a CSV:

| city | uri | lat | lon |
| --- | --- | --- | --- |
| Berlin | http://dbpedia.org/resource/Berlin | 52.5200 | 13.4050 |
| Paris | http://dbpedia.org/resource/Paris | 48.8566 | 2.3522 |

A pseudo-configuration could be something like:

providers:
    - type: feature
      name: CSV
      data: tests/data/cities.csv
      id_field: city
      geometry:
          x_field: lon
          y_field: lat
      sparql_endpoint: https://dbpedia.org/sparql
      sparql_uri_field: uri
      predicates:
          pop: http://dbpedia.org/ontology/populationTotal
          country: http://dbpedia.org/ontology/country

And then the resulting response for example.com/collections/cities/items/Berlin?f=json would be:

{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [
      13.4050,
      52.5200
    ]
  },
  "properties": {
    "uri": "http://dbpedia.org/resource/Berlin",
    "pop": 3769495,
    "country": "http://dbpedia.org/resource/Germany"
  },
  "id": "Berlin"
}
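For reference, a hedged sketch of the glue step this implies: flattening a standard SPARQL 1.1 JSON results document into the plain key/value properties shown above. The sample payload and variable names are illustrative, not an actual DBpedia response.

```python
# Hedged sketch: flatten a SPARQL 1.1 JSON results document (first solution
# only) into simple feature properties. Sample data is illustrative.

XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"

def flatten_bindings(results):
    """Map each bound variable of the first solution to a plain value."""
    props = {}
    for row in results["results"]["bindings"][:1]:
        for var, term in row.items():
            value = term["value"]
            if term.get("datatype") == XSD_INT:
                value = int(value)  # literals arrive as strings
            props[var] = value
    return props

sample = {
    "results": {"bindings": [{
        "pop": {"type": "literal", "datatype": XSD_INT, "value": "3769495"},
        "country": {"type": "uri",
                    "value": "http://dbpedia.org/resource/Germany"},
    }]}
}
print(flatten_bindings(sample))
```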

This about right, @dblodgett-usgs?

See here @webb-ben

ksonda avatar May 06 '21 18:05 ksonda

Yeah -- that's the concept. I'm very open to a more elegant implementation (GeoSPARQL?) but it seems like a simple way to inject the results of a basic SPARQL query would be really nice.
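One way to picture "injecting" those results without writing a combined provider per backend is a decorator that wraps any base provider's item response. This is a minimal sketch of the composition idea only; the class and method names are hypothetical and do not reflect pygeoapi's actual provider API.

```python
# Hypothetical sketch: enrich any base provider's item response with
# SPARQL-derived properties. Names are illustrative, not pygeoapi's API.

class SPARQLDecorator:
    def __init__(self, base_provider, fetch_bindings, uri_field="uri"):
        self.base = base_provider    # e.g. a CSV or SQLite feature provider
        self.fetch = fetch_bindings  # callable: uri -> dict of properties
        self.uri_field = uri_field

    def get(self, identifier):
        feature = self.base.get(identifier)
        uri = feature["properties"].get(self.uri_field)
        if uri:
            # Merge predicate-object pairs in as extra feature properties
            feature["properties"].update(self.fetch(uri))
        return feature
```

The same wrapper could then sit in front of a CSV, SQLite, or PostGIS provider, which is why a single "SPARQL enrichment" layer might beat per-backend combination providers.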

dblodgett-usgs avatar May 07 '21 02:05 dblodgett-usgs

This seems more dependent on what we want the knowledge graph to do than on how to make pygeoapi spit data out of a SPARQL endpoint.

IF we want the triple store to do geometric inferencing THEN

(we need to support geosparql geometry and predicates AND (we need data providers to publish geometry in jsonld as geosparql OR our crawler needs to transform schema:geo type geometry to geosparql on ingest))

OR our crawler needs to do st_intersect on ingested features and minimal set of reference features and add topology triples explicitly

IF we don't want the triple store to do anything like that, or we implement with non-geosparql methods, THEN we need a pure SPARQL provider with the geometries coming from elsewhere, as in your original suggestion
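To make the "crawler adds topology triples explicitly" branch concrete, here is a hedged toy sketch: intersect ingested features against reference features at crawl time and emit explicit `geo:sfIntersects` triples, so the store never needs geometric inferencing. It uses bounding boxes for brevity; a real crawler would test true geometries (e.g. Shapely/PostGIS `st_intersects`). All URIs are made up.

```python
# Toy sketch of crawl-time topology materialization. Boxes are
# (minx, miny, maxx, maxy); real code would intersect actual geometries.

def bbox_intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def topology_triples(features, reference_features):
    """Yield (subject, predicate, object) triples for intersecting pairs."""
    pred = "http://www.opengis.net/ont/geosparql#sfIntersects"
    for uri, box in features:
        for ref_uri, ref_box in reference_features:
            if bbox_intersects(box, ref_box):
                yield (uri, pred, ref_uri)

triples = list(topology_triples(
    [("http://example.org/feature/1", (0, 0, 2, 2))],
    [("http://example.org/ref/A", (1, 1, 3, 3)),
     ("http://example.org/ref/B", (5, 5, 6, 6))],
))
```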

ksonda avatar May 07 '21 04:05 ksonda

@dblodgett-usgs @ksonda I believe that what I coded last year matches your requirements. Check the proposed config file here.

ldesousa avatar May 10 '21 08:05 ldesousa

As per RFC4, this Issue has been inactive for 90 days. In order to manage maintenance burden, it will be automatically closed in 7 days.

github-actions[bot] avatar Mar 10 '24 21:03 github-actions[bot]

As per RFC4, this Issue has been closed due to there being no activity for more than 90 days.

github-actions[bot] avatar Mar 24 '24 03:03 github-actions[bot]

Note: a version of the SPARQL provider is available in cgs-earth/pygeoapi-plugins. @ldesousa Happy to integrate GeoSPARQL into this separate repo as well.

webb-ben avatar Mar 25 '24 22:03 webb-ben