Surfacing more data from catalogue.data.wa.gov.au

Open keithmoss opened this issue 5 years ago • 6 comments

Hi team,

We're beginning to harvest more third party web GIS services into Western Australia's CKAN and would like to surface them in NationalMap. Historically, you've only been surfacing data from the central SLIP (slip.wa.gov.au) geospatial data services platform.

We've recently harvested a range of spatial services from Main Roads, who are hosting data on their own ArcGIS Server instances. There are several other agencies in a similar situation that we're planning to harvest this year, so we'd like to make sure these datasets will be able to show up in NationalMap.

To start with, can you remind us how much control we have over the config of the package_search query NatMap sends us[1]?

[1] https://nationalmap.gov.au/proxy/_1d/http://catalogue.beta.data.wa.gov.au/api/3/action/package_search?rows=100000&sort=metadata_created%20asc&start=0&q=data_homepage%3A*_Public_Services*&fq=res_format%3AWMS
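
For reference, the query in [1] can be unpacked into its individual package_search parameters. A minimal Python sketch using only the standard library (the variable names here are ours, not anything NatMap uses):

```python
from urllib.parse import urlsplit, parse_qs

# The proxied request from [1], copied verbatim.
url = (
    "https://nationalmap.gov.au/proxy/_1d/"
    "http://catalogue.beta.data.wa.gov.au/api/3/action/package_search"
    "?rows=100000&sort=metadata_created%20asc&start=0"
    "&q=data_homepage%3A*_Public_Services*&fq=res_format%3AWMS"
)

# parse_qs percent-decodes each value and returns lists; take the first item.
params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}

for key, value in params.items():
    print(f"{key} = {value}")
```

The q and fq values are the parts that restrict which datasets come back; rows, sort and start only control paging and ordering.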

Ping @vduong2

keithmoss avatar Apr 30 '19 01:04 keithmoss

@keithmoss the query is fairly configurable. Here's the list of properties that can be configured: https://docs.terria.io/guide/connecting-to-data/catalog-type-details/ckan/

And here is how the WA group is configured: https://github.com/TerriaJS/NationalMap-Catalog/blob/master/datasources/includes/WA.ejs

kring avatar Apr 30 '19 01:04 kring

to investigate: There’s an ‘access_level’ property on our CKAN resources that can be used to filter to only show truly public data, but everything on https://docs.terria.io/guide/connecting-to-data/catalog-type-details/ckan/ indicates that Terria is operating on package/dataset level, not resource level. Is that accurate?

AnaBelgun avatar Jul 08 '19 05:07 AnaBelgun

@keithmoss We're using CKAN's package_search, which allows querying, filtering and sorting datasets on dataset attributes. From there we add each dataset and its resources if they match the resource format settings you gave (includeWms, wmsResourceFormat, etc.)
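
Roughly, that per-resource matching could be sketched like this in Python. This is an illustration only, not Terria's actual code: the function name, the sample package and the ^wms$ pattern are assumptions, and the value configured for wmsResourceFormat may differ.

```python
import re

def matching_resources(package, wms_resource_format=r"^wms$"):
    """Return the resources whose format matches the configured pattern."""
    pattern = re.compile(wms_resource_format, re.IGNORECASE)
    return [
        resource
        for resource in package.get("resources", [])
        if pattern.search(resource.get("format") or "")
    ]

# Made-up package shaped like one entry of a package_search result.
package = {
    "name": "example-dataset",
    "resources": [
        {"name": "Map service", "format": "WMS"},
        {"name": "Download", "format": "ZIP"},
    ],
}

selected = matching_resources(package)
print([resource["name"] for resource in selected])
```

Note that the only per-resource test in this sketch is the format match; nothing else about a resource is inspected.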

tephenavies avatar Jul 08 '19 06:07 tephenavies

@steve9164 Am I right in saying there's nothing in that mix to let us further filter the resources that go into NatMap? That is, once NatMap gets a package in the package_search response it assumes that all of its resources are going to be accessible to the user and should be added?

e.g. https://catalogue.data.wa.gov.au/dataset/bush-forever-areas-2000-dop-071 has 7 resources

  • 5 are restricted to authorised users who are logged in (access_level == "govt_only")
  • 2 are public resources open to anonymous users (access_level == "open")

So if we could (in theory) further filter the package_search response down to the 2 public resources that are actually relevant to NatMap, by looking for resources[*]["access_level"] == "open", we'd be able to avoid presenting the user with what look like duplicates that ask them to log in.
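
In rough Python terms, the filter I have in mind would look something like the sketch below. It assumes access_level is present on each resource dict in the package_search response; the function name and the sample data are made up for illustration (only the 5 govt_only / 2 open split mirrors the dataset above).

```python
def public_resources(package, access_level="open"):
    """Keep only the resources whose access_level matches."""
    return [
        resource
        for resource in package.get("resources", [])
        if resource.get("access_level") == access_level
    ]

# Illustrative stand-in for the dataset above: 5 restricted, 2 public.
package = {
    "name": "bush-forever-areas-2000-dop-071",
    "resources": (
        [{"name": f"restricted-{i}", "access_level": "govt_only"} for i in range(5)]
        + [{"name": f"public-{i}", "access_level": "open"} for i in range(2)]
    ),
}

visible = public_resources(package)
print(len(package["resources"]), "resources,", len(visible), "public")
```

Resources with no access_level at all are dropped by this sketch, which seems like the safer default for a public map.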

keithamoss avatar Jul 09 '19 07:07 keithamoss

@keithamoss That is correct. Terria does a bit of resource filtering to try to list all and only the resources it can display but it doesn't have a way to filter by "access_level".

tephenavies avatar Jul 10 '19 07:07 tephenavies

It might be possible with a q= or fq=, but we're not sure. @keithamoss, if you can help us figure it out, it's easy to make Terria do it.

AnaBelgun avatar Jul 11 '19 01:07 AnaBelgun