rdf-validate-shacl

Shape inheritance only works when a single dataset is used

Open tpluscode opened this issue 4 years ago • 5 comments

The validator supports shape hierarchies using rdfs:subClassOf, but they are not applied correctly if the data graph and shapes graph are separate instances of Dataset.

Consider this example.

The result is a false positive, even though the instance should be validated as a foaf:Agent.

The correct result is returned if a single dataset is populated, for example using actual named graphs:

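// Assumes $rdf is an RDF/JS environment such as rdf-ext, and clownface is the clownface package
// Both pointers share one dataset, each scoped to its own named graph
const dataset = $rdf.dataset()
const dataGraph = clownface({ dataset, graph: $rdf.namedNode('data-graph') })
const shapesGraph = clownface({ dataset, graph: $rdf.namedNode('shapes-graph') })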
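
To make the difference concrete, here is a minimal sketch that uses plain arrays of triples in place of RDF/JS datasets. The helpers `classWithSubclasses` and `targetNodes` and the `ex:` terms are hypothetical and are not the validator's actual code; they only assume that the class hierarchy and the instances are read from the same collection of triples.

```javascript
// Plain [subject, predicate, object] arrays stand in for RDF/JS datasets.
const shapesTriples = [
  ['ex:AgentShape', 'sh:targetClass', 'foaf:Agent'],
  ['foaf:Person', 'rdfs:subClassOf', 'foaf:Agent'],
]
const dataTriples = [
  ['ex:alice', 'rdf:type', 'foaf:Person'],
]

// Hypothetical helper: `cls` plus every transitive subclass declared in `triples`.
function classWithSubclasses(triples, cls) {
  const found = new Set([cls])
  const queue = [cls]
  while (queue.length > 0) {
    const current = queue.pop()
    for (const [s, p, o] of triples) {
      if (p === 'rdfs:subClassOf' && o === current && !found.has(s)) {
        found.add(s)
        queue.push(s)
      }
    }
  }
  return found
}

// Hypothetical helper: focus nodes for a target class, with the hierarchy
// and the instances both looked up in the same `triples`.
function targetNodes(triples, cls) {
  const classes = classWithSubclasses(triples, cls)
  return triples
    .filter(([, p, o]) => p === 'rdf:type' && classes.has(o))
    .map(([s]) => s)
}

// Separate datasets: the hierarchy is invisible, so ex:alice is never targeted.
const separateTargets = targetNodes(dataTriples, 'foaf:Agent')
// Union of both: foaf:Person is known to be a subclass, so ex:alice is found.
const unionTargets = targetNodes([...dataTriples, ...shapesTriples], 'foaf:Agent')

console.log(separateTargets) // []
console.log(unionTargets)    // [ 'ex:alice' ]
```

With no focus node selected in the first case, nothing is checked against the shape, which is how a report can conform when it should not.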

tpluscode, Apr 18 '21 15:04

For inheritance to work, your "ontology" (rdfs:subClassOf) must be included in the "data" dataset. I think it makes sense, doesn't it?

martinmaillard, Apr 19 '21 06:04

No, I don't think it does. The class hierarchy is not a property of the data itself but of its meta model (classes).

Holger suggests running the validation over the union graph, which already works, as I mentioned above. The only problem is when the two RDF/JS datasets are separate.

I would consider merging them in memory (maybe opt-in, to prevent copying large amounts of data). That would be transparent to the caller, whether they use one or two datasets as input.
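
As a rough illustration of what an opt-in merge could look like, the sketch below copies both inputs into one collection, placing each in its own named graph. The function `prepareDatasets`, the `merge` option, and the graph names are all hypothetical and not part of the validator's API; quads are plain objects rather than RDF/JS quads.

```javascript
// Hypothetical opt-in merge: nothing is copied unless `merge` is true.
function prepareDatasets(dataQuads, shapesQuads, { merge = false } = {}) {
  if (!merge) {
    // Default: keep the caller's collections as they are, with no copying.
    return { data: dataQuads, shapes: shapesQuads, merged: null }
  }
  // Opt-in: copy every quad into a joint collection, tagged with its graph.
  const merged = [
    ...dataQuads.map(q => ({ ...q, graph: 'data-graph' })),
    ...shapesQuads.map(q => ({ ...q, graph: 'shapes-graph' })),
  ]
  return { data: dataQuads, shapes: shapesQuads, merged }
}

const dataInput = [
  { subject: 'ex:alice', predicate: 'rdf:type', object: 'foaf:Person' },
]
const shapesInput = [
  { subject: 'foaf:Person', predicate: 'rdfs:subClassOf', object: 'foaf:Agent' },
]

const untouched = prepareDatasets(dataInput, shapesInput)
const joint = prepareDatasets(dataInput, shapesInput, { merge: true })

console.log(untouched.merged)                // null
console.log(joint.merged.map(q => q.graph))  // [ 'data-graph', 'shapes-graph' ]
```

The caller's inputs are left unmodified in both cases, so the merge stays an internal detail of validation.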

tpluscode, Apr 19 '21 07:04

I like the idea of merging the datasets inside the validator, but I'm afraid of the consequences, mostly regarding blank nodes.

We're back to the same considerations: if we can agree that users should not expect blank nodes to keep their ID in the final "validation report" dataset, changing these IDs during validation shouldn't be an issue.

martinmaillard, Apr 20 '21 06:04

I think blank nodes would not be a problem if the "merging" happened in named graphs inside the joint dataset.

The only requirement would be to address them as (bnode, graph). If clownface is used, that would already be taken care of by using pointers to a specific graph.
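
A small sketch of the (bnode, graph) idea, assuming quads are plain objects and that both inputs happen to use the same blank node label. The `nodeKey` helper is hypothetical; with clownface, a pointer scoped to one graph would play this role.

```javascript
// Both source datasets used the label _:b0 for unrelated blank nodes.
const jointQuads = [
  { subject: '_:b0', predicate: 'rdf:type', object: 'foaf:Person', graph: 'data-graph' },
  { subject: '_:b0', predicate: 'sh:path', object: 'foaf:name', graph: 'shapes-graph' },
]

// Hypothetical key: a node is identified by its label together with its graph.
const nodeKey = (term, graph) => `${graph} ${term}`

// Group quads by (bnode, graph) so the two _:b0 nodes are never conflated.
const quadsByNode = new Map()
for (const quad of jointQuads) {
  const key = nodeKey(quad.subject, quad.graph)
  if (!quadsByNode.has(key)) quadsByNode.set(key, [])
  quadsByNode.get(key).push(quad)
}

console.log(quadsByNode.size) // 2
```

Keyed by label alone, the two nodes would collapse into one; keyed by label and graph, they stay distinct without renaming any blank node.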

tpluscode, Apr 20 '21 06:04

I tried to evaluate how easy this change would be and I think it will require a major refactoring: mostly passing clownface pointers around instead of nodes without context.

martinmaillard, Apr 27 '21 08:04