ckanext-dcat

RDF output for distributions broken

Open letmaik opened this issue 10 years ago • 6 comments

All distributions of a dataset are merged together in a single Distribution. An example: http://ckan-demo.melodiesproject.eu/dataset/test-dataset-for-testing-distributions.rdf

This is a severe bug and happens in all RDF output formats. What's going on here?

letmaik, Oct 27 '15 11:10

http://ckan-demo.melodiesproject.eu/api/action/package_show?id=test-dataset-for-testing-distributions

All resources have a uri field with the value None. That is why they are all considered to be the same distribution. We need to find out where this None came from.
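To illustrate why identical URIs collapse the distributions, here is a minimal sketch that models an RDF graph as a plain set of (subject, predicate, object) triples. It is not the extension's serializer; the function name, the resource names, and the prefixed predicate strings are made up for illustration.

```python
# Hypothetical, simplified model: an RDF graph is a set of triples,
# so statements that share a subject describe one and the same node.
def serialize_resources(resources):
    graph = set()
    for res in resources:
        subject = res["uri"]  # the same value for every resource in this report
        graph.add((subject, "rdf:type", "dcat:Distribution"))
        graph.add((subject, "dct:title", res["name"]))
    return graph


# Two distinct resources that both carry the bogus uri value.
resources = [
    {"uri": "None", "name": "CSV file"},
    {"uri": "None", "name": "JSON file"},
]

graph = serialize_resources(resources)
subjects = {s for s, _, _ in graph}
print(len(subjects))  # 1: both resources ended up on a single subject
```

Both titles are attached to the one remaining subject, which matches the merged Distribution seen in the RDF output.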

Did you create the dataset manually, or was it harvested? If it was harvested, with which harvester and from where?

amercader, Oct 27 '15 11:10

Yes, harvested with this extension from https://github.com/ec-melodies/wp02-dcat/blob/master/WP2.jsonld

The distributions don't contain an ID; however, I don't see an issue with that.

letmaik, Oct 27 '15 11:10

This does indeed sound like a bug, either in the URI parsing in the processors or in the functions that guess the URIs when serializing. I'll have a look when I have a chance, but any head start on your side is appreciated.

amercader, Oct 27 '15 12:10

Relevant code parts:
https://github.com/ckan/ckanext-dcat/blob/156ef8cadf288228630581d3302c5522d81f27d1/ckanext/dcat/profiles.py#L900
https://github.com/ckan/ckanext-dcat/blob/156ef8cadf288228630581d3302c5522d81f27d1/ckanext/dcat/utils.py#L93

Looking at the resource_uri() code, the only way the ID can end up as "None" is if uri = resource_dict.get('uri') actually returns the string "None", which I guess would be a bug outside this plugin. Could that be the case?
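A minimal sketch of that reasoning, assuming the function falls back to a generated URI only when the stored value is falsy. This is a simplified stand-in, not the actual resource_uri() from utils.py; the name resource_uri_sketch, the base parameter, and the fallback URL pattern are assumptions.

```python
# Simplified stand-in for the lookup discussed above (not the plugin's code).
def resource_uri_sketch(resource_dict, base="http://example.org"):
    uri = resource_dict.get("uri")
    if not uri:
        # Assumed fallback: build a URI from the resource id.
        uri = "{0}/resource/{1}".format(base, resource_dict["id"])
    return uri


# A real None (or a missing key) is falsy, so the fallback is used.
print(resource_uri_sketch({"id": "abc", "uri": None}))

# The *string* "None" is truthy, so it is returned unchanged.
print(resource_uri_sketch({"id": "abc", "uri": "None"}))
```

Under that assumption, a stored string "None" passes straight through, so every resource would get the same identifier.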

letmaik, Oct 28 '15 15:10

Just debugged it a bit, and I was right: resource_dict['uri'] is actually the string "None". The question is, where does that come from?

letmaik, Oct 29 '15 11:10

When I request the dataset through the CKAN API, I also get it back as "uri":"None". I opened an issue in the main CKAN repo: https://github.com/ckan/ckan/issues/2716
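For anyone who wants to check their own datasets, here is a rough sketch that tells the string "None" apart from a JSON null in a package_show response. The sample payload and the helper name find_bogus_uris are made up for illustration; only the "result" and "resources" keys follow the CKAN API response layout.

```python
import json

# Hypothetical sample of a package_show response body, trimmed to the relevant keys.
sample_body = """
{
  "success": true,
  "result": {
    "resources": [
      {"id": "r1", "uri": "None"},
      {"id": "r2", "uri": null},
      {"id": "r3"}
    ]
  }
}
"""


def find_bogus_uris(response):
    """Return the ids of resources whose uri is the literal string "None"."""
    resources = response["result"]["resources"]
    return [r["id"] for r in resources if r.get("uri") == "None"]


response = json.loads(sample_body)
print(find_bogus_uris(response))  # ['r1']
```

A JSON null decodes to a real Python None and a missing key yields None from .get(), so only the stringified value is flagged.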

letmaik, Oct 29 '15 11:10