ckanext-dcat
Expose dataset metadata as JSON-LD for Google Dataset Search
Hi everyone!
I installed the DCAT extension on CKAN 2.9 and I would like to expose my datasets to Google Dataset Search by using the JSON-LD endpoint. I am using the ckanext-scheming extension in order to create customized fields.
I would like to have something like in this example here.
I am particularly interested in the temporalCoverage and spatialCoverage fields. I implemented them in my schema file but I don't see them in the JSON-LD file created by the DCAT extension. I probably missed something.
Here is how I implemented my temporalCoverage field:
{
  "field_name": "temporals",
  "label": "Temporal coverage",
  "repeating_subfields": [
    {
      "field_name": "startDate",
      "label": "Start Date",
      "display_property": "schema:startDate"
    },
    {
      "field_name": "endDate",
      "label": "End Date",
      "display_property": "schema:endDate"
    }
  ]
}
Can anyone help?
@maxclac It looks like the SchemaOrg profile used to generate the JSON-LD snippet that Google Dataset Search will parse expects temporal_start and temporal_end to be the field names. Can you try changing the field names to see if that works? If you can't change the names, you will need to create a custom profile that parses your fields in a similar way.
Note that the last time I worked on this, Google Dataset Search didn't support parsing schema.org JSON-LD from a linked file; it needed to be embedded in the source of the page. That's what the structured_data plugin does, using the profile I linked above.
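To illustrate the two points above (flat temporal_start / temporal_end fields, and JSON-LD embedded in the page source), here is a rough standalone sketch in plain Python. It is not the ckanext-dcat code: dataset_to_jsonld and embed_jsonld are hypothetical helper names, and the "start/end" interval format follows the schema.org description of temporalCoverage rather than anything confirmed in this thread.

```python
import json


def dataset_to_jsonld(dataset):
    """Build a minimal schema.org Dataset dict from a CKAN-style dataset dict.

    Hypothetical helper: only illustrates how flat temporal_start and
    temporal_end fields could map to schema.org temporalCoverage.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": dataset.get("title") or dataset.get("name"),
    }
    start = dataset.get("temporal_start")
    end = dataset.get("temporal_end")
    if start and end:
        # schema.org expects an ISO 8601 interval: "start/end"
        doc["temporalCoverage"] = f"{start}/{end}"
    elif start:
        # schema.org allows ".." for an open-ended range
        doc["temporalCoverage"] = f"{start}/.."
    return doc


def embed_jsonld(doc):
    """Wrap a JSON-LD dict in a script tag suitable for the page source."""
    # Escape "</" so the payload cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(doc, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    example = {
        "title": "Rainfall observations",
        "temporal_start": "2019-01-01",
        "temporal_end": "2019-12-31",
    }
    print(embed_jsonld(dataset_to_jsonld(example)))
```

In a real CKAN site the structured_data plugin takes care of the embedding, so this is only meant to show the shape of the output that the field names feed into.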
Thank you @amercader. I changed the field names but it did not help. I do not need to use schemas from schema.org. Any will do, as long as I can have my fields in the JSON-LD file. If I could manage this without having to write my own custom profile, it would be great, because I don't have enough understanding of CKAN yet to be able to do this.
Update: it works when I define the start and end dates as two separate top-level fields named temporal_start and temporal_end, and not as subfields of temporals:
{
  "field_name": "temporal_start",
  "label": "Start Date",
  "display_property": "schema:startDate"
},
{
  "field_name": "temporal_end",
  "label": "End Date",
  "display_property": "schema:endDate"
}
I think I simply need to go through the SchemaOrgProfile, see which field names it expects, and maybe modify it to fit my needs.
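For anyone who wants to keep the repeating temporals subfields in the schema, one possible direction is to copy the values into the flat names before the profile sees the dataset. The sketch below is plain Python and makes several assumptions: flatten_temporals is a hypothetical helper, the subfield names startDate and endDate come from my schema above, and only the first entry is used because the flat field names can hold a single interval. Where to call it in CKAN (for example from a custom profile) is not shown here.

```python
def flatten_temporals(dataset, source_field="temporals"):
    """Return a copy of the dataset dict with flat temporal fields added.

    Hypothetical helper: copies startDate / endDate from the first
    repeating-subfield entry into temporal_start / temporal_end.
    Existing flat values are left untouched.
    """
    result = dict(dataset)  # shallow copy, the input dict is not modified
    entries = result.get(source_field) or []
    if not entries:
        return result

    first = entries[0]
    if first.get("startDate") and not result.get("temporal_start"):
        result["temporal_start"] = first["startDate"]
    if first.get("endDate") and not result.get("temporal_end"):
        result["temporal_end"] = first["endDate"]
    return result


if __name__ == "__main__":
    dataset = {
        "name": "rainfall",
        "temporals": [{"startDate": "2019-01-01", "endDate": "2019-12-31"}],
    }
    print(flatten_temporals(dataset))
```

If a dataset really has several intervals, this approach loses all but the first one, so a custom profile would still be needed in that case.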